03. Convolutional Neural Networks and Computer Vision with TensorFlow¶
We've covered some TensorFlow basics and built a few models. Now let's get specific and learn about a special kind of neural network called a convolutional neural network (CNN), which helps detect patterns in visual data.
Note: Many different kinds of model architecture can be used for different deep learning problems. You can use a CNN on image data and even text data, though other architectures may work better for certain problems, so don't rely on one specific method.
For example:
- Classify whether a picture is a pizza 🍕 or steak 🥩 (we'll be doing this)
- Detect whether an object appears in an image (did a car just pass by the dashcam?)
What's covered:¶
- Getting a dataset to work with
- Architecture of a CNN (Convolutional Neural Network)
- A quick end-to-end example (what we're working towards)
- Steps in modelling for binary image classification with a CNN
- Becoming one with the data
- Preparing data for modelling
- Creating a CNN model (starting with a baseline)
- Fitting a model (getting it to find patterns in our data)
- Evaluating a model
- Improving a model
- Making a prediction with a trained model
- Steps in modelling for multi-class image classification with a CNN
- Same as above (but with a different dataset)
import datetime
print(f"Notebook last run (end-to-end): {datetime.datetime.now()}")
Notebook last run (end-to-end): 2025-05-21 10:08:00.984057
Get the data¶
CNNs work very well with image data, so we're going to use an image dataset to learn about them.
Our images will come from Food-101, a collection of 101,000 real-world images of food dishes across 101 different categories.
To begin, we'll build a binary classifier for pizza 🍕 and steak 🥩.
There's a pre-made zip file containing only the pizza and steak images, so no preprocessing is needed for now.
import zipfile
import urllib.request

# Step 1: Download the zip file
url = "https://storage.googleapis.com/ztm_tf_course/food_vision/pizza_steak.zip"
urllib.request.urlretrieve(url, "pizza_steak.zip")  # saves the file locally

# Step 2: Unzip the file
with zipfile.ZipFile("pizza_steak.zip", "r") as zip_ref:
    zip_ref.extractall()  # extract all files to the current directory
Inspect the data (become one with it)¶
This is a very crucial step: visualize and scan through the folders of data you're working with, and try to understand what it is you're dealing with.
The file structure is formatted to what you may typically see when working with image datasets.
Example of what it looks like:
pizza_steak <- top level folder
└───train <- training images
│ └───pizza
│ │ │ 1008104.jpg
│ │ │ 1638227.jpg
│ │ │ ...
│ └───steak
│ │ 1000205.jpg
│ │ 1647351.jpg
│ │ ...
│
└───test <- testing images
│ └───pizza
│ │ │ 1001116.jpg
│ │ │ 1507019.jpg
│ │ │ ...
│ └───steak
│ │ 100274.jpg
│ │ 1653815.jpg
│ │ ...
Let's inspect the directory, which can be done with os.listdir (short for "list directory"). First, we must import os.
import os
os.listdir('pizza_steak')
['test', 'train']
os.listdir('pizza_steak/train/')
['steak', 'pizza']
files = os.listdir('pizza_steak/train/steak/')
cols = 10

# Print filenames in a grid, `cols` per row
for i in range(0, len(files), cols):
    print(" ".join(files[i:i+cols]))
239025.jpg 1155665.jpg 3007772.jpg 1598345.jpg 658189.jpg 172936.jpg 3807440.jpg 168775.jpg 331860.jpg 2939678.jpg
... (output truncated: 750 filenames, 10 per row)
files = os.listdir('pizza_steak/train/pizza/')
cols = 10

for i in range(0, len(files), cols):
    print(" ".join(files[i:i+cols]))
2577377.jpg 102037.jpg 384215.jpg 1033251.jpg 2312987.jpg 3057192.jpg 2501961.jpg 132484.jpg 1888911.jpg 3426946.jpg
... (output truncated: 750 filenames, 10 per row)
There are a lot of images, but how many exactly?
# walk through the pizza_steak directory and list the number of files
for dirpath, dirnames, filenames in os.walk('pizza_steak'):
    print(f"There are {len(dirnames)} directories and {len(filenames)} images in `{dirpath}`")
There are 2 directories and 0 images in `pizza_steak`
There are 2 directories and 0 images in `pizza_steak\test`
There are 0 directories and 250 images in `pizza_steak\test\steak`
There are 0 directories and 250 images in `pizza_steak\test\pizza`
There are 2 directories and 0 images in `pizza_steak\train`
There are 0 directories and 750 images in `pizza_steak\train\steak`
There are 0 directories and 750 images in `pizza_steak\train\pizza`
# another way to find number of images in a folder
num_steak_images_train = len(os.listdir('pizza_steak/train/steak/'))
num_steak_images_train
750
# get list of class names (very useful if dealing with many categories and classes)
import pathlib
import numpy as np
data_dir = pathlib.Path('pizza_steak/train/') # turn our training path into python path
# it turns our folder string, into a path object. A flexible way to work with file paths in python
class_names = np.array(sorted([item.name for item in data_dir.glob('*')])) # created a list of class_names from the subdirectories
# data_dir.glob > gets all items in the directory
# item.name for item > extracts only the name of each item
# sorted(...) > sorts name by alphabetical order
# np.array(...) > converts sorted list to Numpy array for efficient processing
class_names
array(['pizza', 'steak'], dtype='<U5')
Based on the above info, we've got 750 training images and 250 test images of each class (steak and pizza).
Now let's visualize one of the images
# view an image
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import random

def view_random_image(target_dir, target_class):
    # Setup target directory
    target_folder = target_dir + target_class

    # Get a random image path (the 1 tells random.sample to pick only one image)
    random_image = random.sample(os.listdir(target_folder), 1)

    # Read in the image and plot it using matplotlib
    # random.sample returns a list, so index [0] to get the single filename string
    img = mpimg.imread(target_folder + '/' + random_image[0])
    plt.imshow(img)
    plt.title(target_class)
    plt.axis('off');

    print(f"Image shape: {img.shape}")
    return img
# view a random image from the training dataset
# view a random image from the training dataset
img = view_random_image(target_dir='pizza_steak/train/',
                        target_class='steak')
Image shape: (512, 512, 3)
You might notice we've printed the image shape alongside the actual image. This is because the computer "sees" the image as a big array of numbers (a tensor).
# how the computer views img
img
array([[[226, 156, 44],
[220, 150, 38],
[207, 137, 23],
...,
[219, 170, 173],
[212, 163, 166],
[220, 171, 174]],
[[223, 155, 46],
[219, 151, 40],
[212, 144, 33],
...,
[209, 160, 163],
[208, 159, 162],
[212, 163, 166]],
[[211, 146, 42],
[209, 145, 39],
[211, 147, 41],
...,
[207, 159, 159],
[209, 161, 161],
[210, 162, 162]],
...,
[[224, 192, 171],
[227, 195, 174],
[228, 196, 175],
...,
[233, 207, 190],
[236, 211, 191],
[242, 217, 197]],
[[222, 192, 168],
[226, 196, 172],
[228, 197, 176],
...,
[234, 208, 191],
[238, 213, 193],
[244, 219, 199]],
[[225, 195, 171],
[229, 199, 175],
[229, 198, 177],
...,
[230, 204, 187],
[232, 207, 187],
[234, 209, 189]]], dtype=uint8)
img.shape # image shape, returned as (height, width, colour channels)
(512, 512, 3)
Our image is a 3D array of (height, width, colour channels). The width and height of images in our dataset may vary, but there will always be 3 colour channels, representing red, green, and blue (RGB).
The img array contains values between 0 and 255, the possible range for each of the red, green, and blue channels (grayscale images use the same 0-255 range).
So when building a model to differentiate pizza and steak, it will find patterns in these different pixel values to determine which class is which.
Note: Pixel values are between 0 and 255, but neural networks tend to work much better with values between 0 and 1, so we'll need to scale (normalize) them by dividing by 255.
# get all values within 0 to 1
img/255.0
array([[[0.88627451, 0.61176471, 0.17254902],
[0.8627451 , 0.58823529, 0.14901961],
[0.81176471, 0.5372549 , 0.09019608],
...,
[0.85882353, 0.66666667, 0.67843137],
[0.83137255, 0.63921569, 0.65098039],
[0.8627451 , 0.67058824, 0.68235294]],
[[0.8745098 , 0.60784314, 0.18039216],
[0.85882353, 0.59215686, 0.15686275],
[0.83137255, 0.56470588, 0.12941176],
...,
[0.81960784, 0.62745098, 0.63921569],
[0.81568627, 0.62352941, 0.63529412],
[0.83137255, 0.63921569, 0.65098039]],
[[0.82745098, 0.57254902, 0.16470588],
[0.81960784, 0.56862745, 0.15294118],
[0.82745098, 0.57647059, 0.16078431],
...,
[0.81176471, 0.62352941, 0.62352941],
[0.81960784, 0.63137255, 0.63137255],
[0.82352941, 0.63529412, 0.63529412]],
...,
[[0.87843137, 0.75294118, 0.67058824],
[0.89019608, 0.76470588, 0.68235294],
[0.89411765, 0.76862745, 0.68627451],
...,
[0.91372549, 0.81176471, 0.74509804],
[0.9254902 , 0.82745098, 0.74901961],
[0.94901961, 0.85098039, 0.77254902]],
[[0.87058824, 0.75294118, 0.65882353],
[0.88627451, 0.76862745, 0.6745098 ],
[0.89411765, 0.77254902, 0.69019608],
...,
[0.91764706, 0.81568627, 0.74901961],
[0.93333333, 0.83529412, 0.75686275],
[0.95686275, 0.85882353, 0.78039216]],
[[0.88235294, 0.76470588, 0.67058824],
[0.89803922, 0.78039216, 0.68627451],
[0.89803922, 0.77647059, 0.69411765],
...,
[0.90196078, 0.8 , 0.73333333],
[0.90980392, 0.81176471, 0.73333333],
[0.91764706, 0.81960784, 0.74117647]]])
A (typical) architecture of a convolutional neural network¶
CNNs are not fundamentally different from other deep learning neural networks; they just use some specialized layers, which can be combined in many different ways. Here's a list of typical components in a CNN:
| Hyperparameter/layer type | What it does? | Typical values |
|---|---|---|
| Input image(s) | Target images you want to discover patterns of | Any type of photo (or video) |
| Input layer💙 | Takes target image, and preprocesses them for further layers | input_shape = [batch_size, img_height, img_width, channels] |
| Convolutional layer💧 | Extracts/learns most important features in image | Multiple, can create with tf.keras.layers.ConvXD (X = dimensionality, e.g. Conv2D for images) |
| Hidden activation💚 | Adds non-linearity to learned features (non-straight lines) | Usually ReLU (tf.keras.activations.relu) |
| Pooling layer💛 | Reduces dimensionality/size of learned image features | Average (tf.keras.layers.AvgPool2D) or Max (tf.keras.layers.MaxPool2D) |
| Fully connected layer🧡 | Further refines learned features from convolution layers | tf.keras.layers.Dense |
| Output layer🧡 | Takes learned features and outputs them in shape of target labels | output_shape = [number_of_classes] (e.g. 3 for pizza, steak or sushi) |
| Output Activation🖤 | Adds non-linearities to output layer | tf.keras.activations.sigmoid (binary classification) or tf.keras.activations.softmax |
A typical CNN model, colour coded to its respective layer type

A simple example of how you might stack the above layers into a convolutional neural network. Note that the convolutional and pooling layers can often be arranged and rearranged into many different combinations.
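To build intuition for what the convolutional and pooling layers in the table above actually compute, here's a minimal hand-rolled NumPy sketch of a 2D convolution (stride 1, no padding) followed by ReLU and 2x2 max pooling. This is only an illustration of the operations, not how tf.keras implements them, and the toy image and kernel values are made up:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a kernel over an image (stride 1, no padding) and sum the products."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Downsample by taking the max of each size x size window."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % size, :w - w % size]
    return trimmed.reshape(h // size, size, w // size, size).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)  # toy 6x6 single-channel "image"
kernel = np.array([[1., 0.],
                   [0., -1.]])                    # toy 2x2 learnable filter

features = conv2d_valid(image, kernel)            # conv layer -> (5, 5) feature map
activated = np.maximum(features, 0)               # hidden activation (ReLU)
pooled = max_pool2d(activated)                    # pooling layer -> (2, 2)
print(features.shape, pooled.shape)
```

In a real CNN the kernel values are learned during training, and each convolutional layer applies many such filters across all input channels at once.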
An end-to-end example¶
We've seen that there are 750 training and 250 testing images of each class. It's time to jump into the deep end.
The original dataset authors' paper states that their Random Forest machine learning models averaged 50.76% accuracy when predicting different foods.
That 50.76% will be our baseline to beat.
Note: A baseline is a score/evaluation metric you want to beat. Usually you start with a simple model, create a baseline with it, then beat it by increasing the model's complexity. A fun way to get a baseline is from a modelling paper with published results.
The code below replicates an end-to-end model for the pizza_steak dataset, built from the CNN components explained above. We'll go through each step later in the notebook.
The model replicates TinyVGG, the computer vision architecture that powers the CNN Explainer webpage.
Resource: The architecture we're using is a scaled-down version of VGG-16.
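As a preview, a TinyVGG-style binary classifier can be sketched in Keras roughly as follows. The layer counts and filter sizes below are illustrative stand-ins for the CNN Explainer architecture (not necessarily the exact model built later in the notebook), and the model still needs a data pipeline before it can be fit:

```python
import tensorflow as tf

# TinyVGG-style sketch: two conv blocks (conv -> conv -> max pool),
# then flatten into a single sigmoid unit for binary classification.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),              # RGB images resized to 224x224
    tf.keras.layers.Conv2D(10, 3, activation="relu"), # convolutional layers learn features
    tf.keras.layers.Conv2D(10, 3, activation="relu"),
    tf.keras.layers.MaxPool2D(),                      # pooling reduces feature map size
    tf.keras.layers.Conv2D(10, 3, activation="relu"),
    tf.keras.layers.Conv2D(10, 3, activation="relu"),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # binary output (pizza vs steak)
])

model.compile(loss="binary_crossentropy",
              optimizer=tf.keras.optimizers.Adam(),
              metrics=["accuracy"])
print(model.output_shape)  # (None, 1)
```

A single sigmoid unit pairs with binary_crossentropy; for the multi-class version later, the output layer would instead use softmax over number_of_classes units.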
!pip install tensorflow
Collecting tensorflow
Downloading tensorflow-2.19.0-cp310-cp310-win_amd64.whl.metadata (4.1 kB)
Collecting absl-py>=1.0.0 (from tensorflow)
Using cached absl_py-2.2.2-py3-none-any.whl.metadata (2.6 kB)
Collecting astunparse>=1.6.0 (from tensorflow)
Using cached astunparse-1.6.3-py2.py3-none-any.whl.metadata (4.4 kB)
Collecting flatbuffers>=24.3.25 (from tensorflow)
Using cached flatbuffers-25.2.10-py2.py3-none-any.whl.metadata (875 bytes)
Collecting gast!=0.5.0,!=0.5.1,!=0.5.2,>=0.2.1 (from tensorflow)
Using cached gast-0.6.0-py3-none-any.whl.metadata (1.3 kB)
Collecting google-pasta>=0.1.1 (from tensorflow)
Using cached google_pasta-0.2.0-py3-none-any.whl.metadata (814 bytes)
Collecting libclang>=13.0.0 (from tensorflow)
Using cached libclang-18.1.1-py2.py3-none-win_amd64.whl.metadata (5.3 kB)
Collecting opt-einsum>=2.3.2 (from tensorflow)
Using cached opt_einsum-3.4.0-py3-none-any.whl.metadata (6.3 kB)
Requirement already satisfied: packaging in x:\anaconda3\envs\tf_gpu\lib\site-packages (from tensorflow) (25.0)
Collecting protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<6.0.0dev,>=3.20.3 (from tensorflow)
Downloading protobuf-5.29.4-cp310-abi3-win_amd64.whl.metadata (592 bytes)
Collecting requests<3,>=2.21.0 (from tensorflow)
Using cached requests-2.32.3-py3-none-any.whl.metadata (4.6 kB)
Requirement already satisfied: setuptools in x:\anaconda3\envs\tf_gpu\lib\site-packages (from tensorflow) (78.1.1)
Requirement already satisfied: six>=1.12.0 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from tensorflow) (1.17.0)
Collecting termcolor>=1.1.0 (from tensorflow)
Using cached termcolor-3.1.0-py3-none-any.whl.metadata (6.4 kB)
Requirement already satisfied: typing-extensions>=3.6.6 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from tensorflow) (4.13.2)
Collecting wrapt>=1.11.0 (from tensorflow)
Downloading wrapt-1.17.2-cp310-cp310-win_amd64.whl.metadata (6.5 kB)
Collecting grpcio<2.0,>=1.24.3 (from tensorflow)
Downloading grpcio-1.71.0-cp310-cp310-win_amd64.whl.metadata (4.0 kB)
Collecting tensorboard~=2.19.0 (from tensorflow)
Using cached tensorboard-2.19.0-py3-none-any.whl.metadata (1.8 kB)
Collecting keras>=3.5.0 (from tensorflow)
Using cached keras-3.10.0-py3-none-any.whl.metadata (6.0 kB)
Collecting numpy<2.2.0,>=1.26.0 (from tensorflow)
Downloading numpy-2.1.3-cp310-cp310-win_amd64.whl.metadata (60 kB)
Collecting h5py>=3.11.0 (from tensorflow)
Downloading h5py-3.13.0-cp310-cp310-win_amd64.whl.metadata (2.5 kB)
Collecting ml-dtypes<1.0.0,>=0.5.1 (from tensorflow)
Downloading ml_dtypes-0.5.1-cp310-cp310-win_amd64.whl.metadata (22 kB)
Collecting tensorflow-io-gcs-filesystem>=0.23.1 (from tensorflow)
Downloading tensorflow_io_gcs_filesystem-0.31.0-cp310-cp310-win_amd64.whl.metadata (14 kB)
Collecting charset-normalizer<4,>=2 (from requests<3,>=2.21.0->tensorflow)
Downloading charset_normalizer-3.4.2-cp310-cp310-win_amd64.whl.metadata (36 kB)
Collecting idna<4,>=2.5 (from requests<3,>=2.21.0->tensorflow)
Using cached idna-3.10-py3-none-any.whl.metadata (10 kB)
Collecting urllib3<3,>=1.21.1 (from requests<3,>=2.21.0->tensorflow)
Downloading urllib3-2.4.0-py3-none-any.whl.metadata (6.5 kB)
Collecting certifi>=2017.4.17 (from requests<3,>=2.21.0->tensorflow)
Downloading certifi-2025.4.26-py3-none-any.whl.metadata (2.5 kB)
Collecting markdown>=2.6.8 (from tensorboard~=2.19.0->tensorflow)
Downloading markdown-3.8-py3-none-any.whl.metadata (5.1 kB)
Collecting tensorboard-data-server<0.8.0,>=0.7.0 (from tensorboard~=2.19.0->tensorflow)
Using cached tensorboard_data_server-0.7.2-py3-none-any.whl.metadata (1.1 kB)
Collecting werkzeug>=1.0.1 (from tensorboard~=2.19.0->tensorflow)
Using cached werkzeug-3.1.3-py3-none-any.whl.metadata (3.7 kB)
Requirement already satisfied: wheel<1.0,>=0.23.0 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from astunparse>=1.6.0->tensorflow) (0.45.1)
Collecting rich (from keras>=3.5.0->tensorflow)
Downloading rich-14.0.0-py3-none-any.whl.metadata (18 kB)
Collecting namex (from keras>=3.5.0->tensorflow)
Using cached namex-0.0.9-py3-none-any.whl.metadata (322 bytes)
Collecting optree (from keras>=3.5.0->tensorflow)
Downloading optree-0.15.0-cp310-cp310-win_amd64.whl.metadata (49 kB)
Collecting MarkupSafe>=2.1.1 (from werkzeug>=1.0.1->tensorboard~=2.19.0->tensorflow)
Downloading MarkupSafe-3.0.2-cp310-cp310-win_amd64.whl.metadata (4.1 kB)
Collecting markdown-it-py>=2.2.0 (from rich->keras>=3.5.0->tensorflow)
Using cached markdown_it_py-3.0.0-py3-none-any.whl.metadata (6.9 kB)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from rich->keras>=3.5.0->tensorflow) (2.19.1)
Collecting mdurl~=0.1 (from markdown-it-py>=2.2.0->rich->keras>=3.5.0->tensorflow)
Using cached mdurl-0.1.2-py3-none-any.whl.metadata (1.6 kB)
Downloading tensorflow-2.19.0-cp310-cp310-win_amd64.whl (375.7 MB)
Downloading grpcio-1.71.0-cp310-cp310-win_amd64.whl (4.3 MB)
Downloading ml_dtypes-0.5.1-cp310-cp310-win_amd64.whl (209 kB)
Downloading numpy-2.1.3-cp310-cp310-win_amd64.whl (12.9 MB)
Downloading protobuf-5.29.4-cp310-abi3-win_amd64.whl (434 kB)
Using cached requests-2.32.3-py3-none-any.whl (64 kB)
Downloading charset_normalizer-3.4.2-cp310-cp310-win_amd64.whl (105 kB)
Using cached idna-3.10-py3-none-any.whl (70 kB)
Using cached tensorboard-2.19.0-py3-none-any.whl (5.5 MB)
Using cached tensorboard_data_server-0.7.2-py3-none-any.whl (2.4 kB)
Downloading urllib3-2.4.0-py3-none-any.whl (128 kB)
Using cached absl_py-2.2.2-py3-none-any.whl (135 kB)
Using cached astunparse-1.6.3-py2.py3-none-any.whl (12 kB)
Downloading certifi-2025.4.26-py3-none-any.whl (159 kB)
Using cached flatbuffers-25.2.10-py2.py3-none-any.whl (30 kB)
Using cached gast-0.6.0-py3-none-any.whl (21 kB)
Using cached google_pasta-0.2.0-py3-none-any.whl (57 kB)
Downloading h5py-3.13.0-cp310-cp310-win_amd64.whl (3.0 MB)
Using cached keras-3.10.0-py3-none-any.whl (1.4 MB)
Using cached libclang-18.1.1-py2.py3-none-win_amd64.whl (26.4 MB)
Downloading markdown-3.8-py3-none-any.whl (106 kB)
Using cached opt_einsum-3.4.0-py3-none-any.whl (71 kB)
Downloading tensorflow_io_gcs_filesystem-0.31.0-cp310-cp310-win_amd64.whl (1.5 MB)
Using cached termcolor-3.1.0-py3-none-any.whl (7.7 kB)
Using cached werkzeug-3.1.3-py3-none-any.whl (224 kB)
Downloading MarkupSafe-3.0.2-cp310-cp310-win_amd64.whl (15 kB)
Downloading wrapt-1.17.2-cp310-cp310-win_amd64.whl (38 kB)
Using cached namex-0.0.9-py3-none-any.whl (5.8 kB)
Downloading optree-0.15.0-cp310-cp310-win_amd64.whl (297 kB)
Downloading rich-14.0.0-py3-none-any.whl (243 kB)
Using cached markdown_it_py-3.0.0-py3-none-any.whl (87 kB)
Using cached mdurl-0.1.2-py3-none-any.whl (10.0 kB)
Installing collected packages: namex, libclang, flatbuffers, wrapt, urllib3, termcolor, tensorflow-io-gcs-filesystem, tensorboard-data-server, protobuf, optree, opt-einsum, numpy, mdurl, MarkupSafe, markdown, idna, grpcio, google-pasta, gast, charset-normalizer, certifi, astunparse, absl-py, werkzeug, requests, ml-dtypes, markdown-it-py, h5py, tensorboard, rich, keras, tensorflow
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
-------------------------------------- - 31/32 [tensorflow]
---------------------------------------- 32/32 [tensorflow]
Successfully installed MarkupSafe-3.0.2 absl-py-2.2.2 astunparse-1.6.3 certifi-2025.4.26 charset-normalizer-3.4.2 flatbuffers-25.2.10 gast-0.6.0 google-pasta-0.2.0 grpcio-1.71.0 h5py-3.13.0 idna-3.10 keras-3.10.0 libclang-18.1.1 markdown-3.8 markdown-it-py-3.0.0 mdurl-0.1.2 ml-dtypes-0.5.1 namex-0.0.9 numpy-2.1.3 opt-einsum-3.4.0 optree-0.15.0 protobuf-5.29.4 requests-2.32.3 rich-14.0.0 tensorboard-2.19.0 tensorboard-data-server-0.7.2 tensorflow-2.19.0 tensorflow-io-gcs-filesystem-0.31.0 termcolor-3.1.0 urllib3-2.4.0 werkzeug-3.1.3 wrapt-1.17.2
!pip install Pillow
Collecting Pillow
  Downloading pillow-11.2.1-cp310-cp310-win_amd64.whl (2.7 MB)
Installing collected packages: Pillow
Successfully installed Pillow-11.2.1
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# Set the seed
tf.random.set_seed(42)
# Preprocess data (get all of the pixel values between 1 and 0, also called scaling/normalization)
train_datagen = ImageDataGenerator(rescale=1./255)
valid_datagen = ImageDataGenerator(rescale=1./255)
# Setup the train and test directories
train_dir = "pizza_steak/train/"
test_dir = "pizza_steak/test/"
# Import data from directories and turn it into batches
train_data = train_datagen.flow_from_directory(train_dir,
batch_size=32, # number of images to process at a time
target_size=(224, 224), # convert all images to be 224 x 224
class_mode="binary", # type of problem we're working on
seed=42)
valid_data = valid_datagen.flow_from_directory(test_dir,
batch_size=32,
target_size=(224, 224),
class_mode="binary",
seed=42)
# Create a CNN model (same as Tiny VGG - https://poloclub.github.io/cnn-explainer/)
model_1 = tf.keras.models.Sequential([
tf.keras.layers.Conv2D(filters=10,
kernel_size=3, # can also be (3, 3)
activation="relu",
input_shape=(224, 224, 3)), # first layer specifies input shape (height, width, colour channels)
tf.keras.layers.Conv2D(10, 3, activation="relu"),
tf.keras.layers.MaxPool2D(pool_size=2, # pool_size can also be (2, 2)
padding="valid"), # padding can also be 'same'
tf.keras.layers.Conv2D(10, 3, activation="relu"),
tf.keras.layers.Conv2D(10, 3, activation="relu"), # activation='relu' == tf.keras.layers.Activations(tf.nn.relu)
tf.keras.layers.MaxPool2D(2),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(1, activation="sigmoid") # binary activation output
])
# Compile the model
model_1.compile(loss="binary_crossentropy",
optimizer=tf.keras.optimizers.Adam(),
metrics=["accuracy"])
# Fit the model
history_1 = model_1.fit(train_data,
epochs=5,
steps_per_epoch=len(train_data),
validation_data=valid_data,
validation_steps=len(valid_data))
Found 1500 images belonging to 2 classes.
Found 500 images belonging to 2 classes.
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\convolutional\base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\trainers\data_adapters\py_dataset_adapter.py:121: UserWarning: Your `PyDataset` class should call `super().__init__(**kwargs)` in its constructor. `**kwargs` can include `workers`, `use_multiprocessing`, `max_queue_size`. Do not pass these arguments to `fit()`, as they will be ignored.
  self._warn_if_super_not_called()
Epoch 1/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 27s 551ms/step - accuracy: 0.6127 - loss: 0.6567 - val_accuracy: 0.8200 - val_loss: 0.4221
Epoch 2/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 9s 182ms/step - accuracy: 0.7963 - loss: 0.4422 - val_accuracy: 0.8440 - val_loss: 0.3758
Epoch 3/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 9s 182ms/step - accuracy: 0.7963 - loss: 0.4386 - val_accuracy: 0.8660 - val_loss: 0.3287
Epoch 4/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 9s 184ms/step - accuracy: 0.8481 - loss: 0.3722 - val_accuracy: 0.8580 - val_loss: 0.3748
Epoch 5/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 9s 185ms/step - accuracy: 0.8452 - loss: 0.3756 - val_accuracy: 0.8700 - val_loss: 0.3321
Nice, we've achieved 87% accuracy on the validation set, way above the 50.78% initial goal! Keep in mind this is a binary classification model (pizza vs steak), a much easier problem than the full 101 categories, but at least it shows the model is able to learn.
Since we've already fit a model, let's check out its architecture.
# check out the layers in our model
model_1.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                 │ (None, 222, 222, 10)   │           280 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_1 (Conv2D)               │ (None, 220, 220, 10)   │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d (MaxPooling2D)    │ (None, 110, 110, 10)   │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_2 (Conv2D)               │ (None, 108, 108, 10)   │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_3 (Conv2D)               │ (None, 106, 106, 10)   │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_1 (MaxPooling2D)  │ (None, 53, 53, 10)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten)               │ (None, 28090)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 1)              │        28,091 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 93,305 (364.48 KB)
Trainable params: 31,101 (121.49 KB)
Non-trainable params: 0 (0.00 B)
Optimizer params: 62,204 (242.99 KB)
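As a sanity check, the parameter counts in the summary above can be reproduced by hand. A `Conv2D` layer has `filters × (kernel_height × kernel_width × input_channels) + filters` parameters (the extra `filters` term is one bias per filter). A quick sketch:

```python
# Parameter count for a Conv2D layer:
# filters * (kernel_h * kernel_w * input_channels) + filters (one bias per filter)
def conv2d_params(filters, kernel_size, input_channels):
    return filters * (kernel_size * kernel_size * input_channels) + filters

print(conv2d_params(10, 3, 3))   # first Conv2D layer (RGB input) -> 280
print(conv2d_params(10, 3, 10))  # later Conv2D layers (10 input channels) -> 910

# Dense layer: inputs * units + units (biases)
print(28090 * 1 + 1)             # final Dense layer -> 28,091
```

These match the `280`, `910`, and `28,091` values in the summary. Note the pooling and flatten layers have 0 parameters: they only reshape or downsample, they don't learn anything.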
Many of the layers in model_1 can be understood through the CNN explainer website, which covers convolutional layers, their activation functions, pooling layers, and flatten layers.
However, there are a few things we haven't discussed yet, namely:
- The ImageDataGenerator class and the rescale parameter
- The flow_from_directory() method
  - The batch_size parameter
  - The target_size parameter
- The Conv2D layers (and the parameters which come with them)
- The MaxPool2D layers (and their parameters)
- The steps_per_epoch and validation_steps parameters in the fit() function
Before learning these, let's try fitting a model we've worked on previously to our data.
Using the same model as before¶
To see how neural networks adapt to different problems, let's revisit the binary classification model we built previously and see how it performs on our data.
We can use all the same parameters except two things:
- The data - we'll be working with images instead of dots
- The input shape - we need to tell the neural network the shape of the images it will be working with.
- Common practice is resizing all images to one size. For us, we'll resize to (224, 224, 3), aka an image width/height of 224 pixels, with 3 colour channels (RGB).
# set random seed
tf.random.set_seed(42)
# create a model to replicate the Tensorflow Playground model
model_2 = tf.keras.Sequential([
tf.keras.layers.Flatten(input_shape=(224,224,3)),
tf.keras.layers.Dense(4, activation='relu'),
tf.keras.layers.Dense(4, activation='relu'),
tf.keras.layers.Dense(1, activation='sigmoid')
])
# compile the model
model_2.compile(loss='binary_crossentropy',
optimizer=tf.keras.optimizers.Adam(),
metrics=['accuracy'])
# fit the model
history_2 = model_2.fit(train_data,
epochs=5,
steps_per_epoch=len(train_data),
validation_data=valid_data,
validation_steps=len(valid_data))
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\reshaping\flatten.py:37: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(**kwargs)
Epoch 1/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 21s 423ms/step - accuracy: 0.4962 - loss: 0.7066 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 2/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 5s 96ms/step - accuracy: 0.5518 - loss: 0.6847 - val_accuracy: 0.5000 - val_loss: 0.6932
Epoch 3/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 4s 94ms/step - accuracy: 0.5259 - loss: 0.6743 - val_accuracy: 0.6380 - val_loss: 0.6128
Epoch 4/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 4s 96ms/step - accuracy: 0.6336 - loss: 0.6272 - val_accuracy: 0.5000 - val_loss: 0.7052
Epoch 5/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 4s 93ms/step - accuracy: 0.4978 - loss: 0.7000 - val_accuracy: 0.5000 - val_loss: 0.6931
Hmm, it doesn't look like the model has learnt anything, reaching only 50% accuracy on both the training and test sets. In binary classification, that's no better than guessing.
# check the second model's architecture
model_2.summary()
Model: "sequential_2"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ flatten_2 (Flatten)             │ (None, 150528)         │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_4 (Dense)                 │ (None, 4)              │       602,116 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_5 (Dense)                 │ (None, 4)              │            20 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_6 (Dense)                 │ (None, 1)              │             5 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 1,806,425 (6.89 MB)
Trainable params: 602,141 (2.30 MB)
Non-trainable params: 0 (0.00 B)
Optimizer params: 1,204,284 (4.59 MB)
It's clear that model_2 has far more parameters than model_1: over 600k trainable parameters versus model_1's roughly 31k, yet model_1 still outperforms it.
Note: trainable parameters are the patterns a model can learn from the data. Intuitively, you may think more is better, but that's not always the case. The difference here is the style of model. A series of Dense layers connects every input to every neuron (which blows up the number of learnable parameters), whereas a CNN focuses on learning the most important patterns in an image.
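To see where model_2's parameter count explodes, a quick back-of-the-envelope calculation helps (a Dense layer has `inputs × units + units` parameters):

```python
# After Flatten, a 224x224x3 image becomes one long feature vector
flattened_features = 224 * 224 * 3
print(flattened_features)                  # -> 150528

# model_2's first Dense layer connects every feature to every neuron
units = 4
print(flattened_features * units + units)  # weights + biases -> 602116
```

This matches the 602,116 parameters shown for `dense_4` in the summary: almost all of model_2's capacity is spent on that single fully connected layer.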
Since model_2 didn't work, how can we make it work? Should we increase the number of layers? Or the number of neurons per layer?
More specifically, we'll increase the number of neurons (also called hidden units) in each dense layer from 4 to 100, and add an extra layer.
Note: Adding extra layers or increasing the number of neurons is known as increasing the complexity of your model.
# set random seed
tf.random.set_seed(42)
# create a model similar to model_1 but add an extra layer and increase the number of hidden units in each layer
model_3 = tf.keras.Sequential([
tf.keras.layers.Flatten(input_shape=(224,224,3)), # dense layers expect a 1-dimensional vector as input
tf.keras.layers.Dense(100, activation='relu'), # increase the number of neurons from 4 to 100 (for every layer)
tf.keras.layers.Dense(100, activation='relu'),
tf.keras.layers.Dense(1, activation='sigmoid')
])
# Compile the model
model_3.compile(loss='binary_crossentropy',
optimizer=tf.keras.optimizers.Adam(),
metrics=['accuracy'])
# Fit the model
history_3 = model_3.fit(train_data,
epochs=5,
steps_per_epoch=len(train_data),
validation_data=valid_data,
validation_steps=len(valid_data))
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\reshaping\flatten.py:37: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(**kwargs)
Epoch 1/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 10s 188ms/step - accuracy: 0.6247 - loss: 6.6889 - val_accuracy: 0.6380 - val_loss: 1.3323
Epoch 2/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 9s 187ms/step - accuracy: 0.6997 - loss: 0.9636 - val_accuracy: 0.6620 - val_loss: 1.2410
Epoch 3/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 8s 177ms/step - accuracy: 0.7238 - loss: 0.9987 - val_accuracy: 0.6800 - val_loss: 1.0702
Epoch 4/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 8s 172ms/step - accuracy: 0.7212 - loss: 0.7914 - val_accuracy: 0.7520 - val_loss: 0.5467
Epoch 5/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 8s 174ms/step - accuracy: 0.7101 - loss: 0.8614 - val_accuracy: 0.7580 - val_loss: 0.4529
The model is definitely learning something, reaching 71% training accuracy and 75% accuracy on the validation dataset!
Let's check out the architecture again.
# check out model_3 architecture
model_3.summary()
Model: "sequential_3"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ flatten_3 (Flatten)             │ (None, 150528)         │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_7 (Dense)                 │ (None, 100)            │    15,052,900 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_8 (Dense)                 │ (None, 100)            │        10,100 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_9 (Dense)                 │ (None, 1)              │           101 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 45,189,305 (172.38 MB)
Trainable params: 15,063,101 (57.46 MB)
Non-trainable params: 0 (0.00 B)
Optimizer params: 30,126,204 (114.92 MB)
Despite having 15 million trainable parameters, model_3 is still outperformed by model_1 and its measly ~31k parameters. This goes to show the power of CNNs and their ability to learn patterns with far fewer parameters.
Binary classification: Let's break it down¶
1. Become one with the data¶
Whatever type of data you're working with, you want to visualize at least 10-100 samples to start building your own mental model of the data.
In our case, we may notice the steak images tend to have darker colours and often include side dishes, whereas pizza tends to have a distinct circular shape. These are the kinds of patterns our neural network may pick up on as well.
You may also notice if some of your data is messed up (e.g. wrongly labelled) and consider ways to fix it.
!pip install matplotlib
Collecting matplotlib
  Downloading matplotlib-3.10.3-cp310-cp310-win_amd64.whl (8.1 MB)
Installing collected packages: pyparsing, kiwisolver, fonttools, cycler, contourpy, matplotlib
Successfully installed contourpy-1.3.2 cycler-0.12.1 fonttools-4.58.0 kiwisolver-1.4.8 matplotlib-3.10.3 pyparsing-3.2.3
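The view_random_image function used below was defined earlier in the notebook. If it isn't already in your session, a minimal version might look like this (a sketch based on how the function is used below, assuming the pizza_steak directory layout):

```python
import os
import random

import matplotlib.image as mpimg
import matplotlib.pyplot as plt

def view_random_image(target_dir, target_class):
    """Plot a random image from target_dir/target_class and return it as an array."""
    target_folder = os.path.join(target_dir, target_class)
    random_image = random.choice(os.listdir(target_folder))  # pick a random filename
    img = mpimg.imread(os.path.join(target_folder, random_image))
    plt.imshow(img)
    plt.title(target_class)
    plt.axis("off")
    print(f"Image shape: {img.shape}")  # (height, width, colour channels)
    return img
```

Returning the image array is handy for inspecting pixel values after plotting.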
# visualize data (requires function 'view_random_image' above)
import matplotlib.pyplot as plt
plt.figure()
plt.subplot(1,2,1)
steak_img = view_random_image('pizza_steak/train/','steak')
plt.subplot(1,2,2)
pizza_img = view_random_image('pizza_steak/train/','pizza')
Image shape: (512, 382, 3) Image shape: (384, 512, 3)
2. Preprocess the data (prepare it for a model)¶
The most important step is creating a training and test set for the model. For us, the data has already been split into training and test sets. Another option would be to carve out a validation set as well, but we won't do that for now.
It's standard to keep train and test data in their own directories, with a subdirectory for each class.
To start, we define the train/test directory paths.
# define training and test directory paths
train_dir = 'pizza_steak/train/'
test_dir = 'pizza_steak/test/'
The next step is to turn the data into batches.
A batch is a small subset of the dataset that the model looks at during training. Instead of processing, say, 10,000 images at once to figure out the patterns between all of them, the model may look at 32 at a time instead.
It does this for a few reasons:
- 10,000 or more images may not fit into the memory of the GPU
- Trying to learn the patterns in 10,000 images in one go can result in the model not learning very well
Why 32? A batch size of 32 tends to be a good default in practice. This is demonstrated by Wilson and Martinez's benchmark tests.

The results show that smaller batch sizes often perform better. As a rule of thumb, 32 is a sensible starting point, and there's rarely a reason to go much above it.
To turn our data into batches, we'll create an instance of ImageDataGenerator for each of our datasets.
# create train and test data generators and rescale the data
from tensorflow.keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale=1/255.)
test_datagen = ImageDataGenerator(rescale=1/255.)
The ImageDataGenerator class helps prepare our images in batches, as well as perform transformations on them as they're loaded into the model.
You might've noticed the rescale parameter, which is an example of such a transformation. Colour values run from 0 to 255, but neural networks tend to perform best when values are scaled between 0 and 1. So we divide every pixel value by its maximum of 255, mapping even a max-value pixel to 1, ready for the neural network.
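As a quick sanity check, here's the same rescaling done by hand with NumPy (a minimal sketch, independent of ImageDataGenerator):

```python
import numpy as np

# a fake 1x3 RGB pixel row with uint8 values (0-255)
pixels = np.array([[[0, 128, 255]]], dtype=np.uint8)

# the same transformation ImageDataGenerator(rescale=1/255.) applies
scaled = pixels / 255.

print(scaled.min(), scaled.max())  # 0.0 1.0 - values now sit between 0 and 1
```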
We can load our images from their respective directories using the flow_from_directory method.
# turn it into batches
train_data = train_datagen.flow_from_directory(directory=train_dir,
target_size=(224,224),
class_mode='binary',
batch_size=32)
test_data = test_datagen.flow_from_directory(directory=test_dir,
target_size=(224,224),
class_mode='binary',
batch_size=32)
Found 1500 images belonging to 2 classes. Found 500 images belonging to 2 classes.
Now we have 1500 training images belonging to 2 classes (pizza and steak), and 500 test images belonging to the same 2 classes.
Some things to note:
- Because of the directory structure, the classes are inferred from the subdirectory names in the train/test folders.
- The target_size parameter defines the input size in (height, width) format.
- The class_mode value of 'binary' defines our classification problem type. With more than two categories, it becomes 'categorical'.
- The batch_size defines how many images will be in each batch. We're using the default of 32.
We can take a look at our batched images and labels by inspecting the train_data object.
# get a sample of the training data
images, labels = next(train_data) # getting the next batch of images
len(images), len(labels)
(32, 32)
Seems our data comes in batches of 32, as it should.
Let's see what the images look like.
# get the first two images
images[:2], images[0].shape
(array([[[[0.56078434, 0.63529414, 0.79215693],
[0.5647059 , 0.6392157 , 0.7960785 ],
[0.5647059 , 0.6392157 , 0.80392164],
...,
[0.07843138, 0.08235294, 0.05882353],
[0.08235294, 0.08235294, 0.07450981],
[0.09803922, 0.09803922, 0.09803922]],
[[0.5647059 , 0.6392157 , 0.7960785 ],
[0.5568628 , 0.6313726 , 0.7960785 ],
[0.5568628 , 0.6313726 , 0.7960785 ],
...,
[0.09803922, 0.10196079, 0.07058824],
[0.0627451 , 0.06666667, 0.04705883],
[0.04313726, 0.04313726, 0.03529412]],
[[0.5686275 , 0.6431373 , 0.8078432 ],
[0.5647059 , 0.6392157 , 0.80392164],
[0.5647059 , 0.6392157 , 0.8078432 ],
...,
[0.07450981, 0.07843138, 0.04705883],
[0.15686275, 0.16078432, 0.13725491],
[0.21568629, 0.21960786, 0.20000002]],
...,
[[0.3921569 , 0.34901962, 0.22352943],
[0.39607847, 0.3529412 , 0.23529413],
[0.3372549 , 0.28235295, 0.1764706 ],
...,
[0.5372549 , 0.5294118 , 0.5803922 ],
[0.5372549 , 0.5294118 , 0.5803922 ],
[0.53333336, 0.5254902 , 0.5764706 ]],
[[0.38431376, 0.34901962, 0.23529413],
[0.34117648, 0.30588236, 0.19215688],
[0.16862746, 0.12941177, 0.03137255],
...,
[0.5372549 , 0.5294118 , 0.58431375],
[0.5372549 , 0.5294118 , 0.58431375],
[0.5411765 , 0.5254902 , 0.5803922 ]],
[[0.17254902, 0.14901961, 0.05490196],
[0.22352943, 0.20000002, 0.10588236],
[0.21176472, 0.18039216, 0.09019608],
...,
[0.5254902 , 0.5137255 , 0.5803922 ],
[0.5294118 , 0.5137255 , 0.57254905],
[0.5294118 , 0.5137255 , 0.5686275 ]]],
[[[0.31764707, 0.39607847, 0.5019608 ],
[0.38431376, 0.46274513, 0.56078434],
[0.34117648, 0.427451 , 0.5176471 ],
...,
[0.31764707, 0.24705884, 0.24705884],
[0.28627452, 0.21176472, 0.21960786],
[0.27058825, 0.19607845, 0.20392159]],
[[0.27450982, 0.29803923, 0.43137258],
[0.3137255 , 0.3372549 , 0.46274513],
[0.3019608 , 0.3372549 , 0.45098042],
...,
[0.32156864, 0.25882354, 0.25882354],
[0.3019608 , 0.2392157 , 0.2392157 ],
[0.28627452, 0.22352943, 0.22352943]],
[[0.28627452, 0.26666668, 0.42352945],
[0.32156864, 0.3137255 , 0.4666667 ],
[0.32941177, 0.32941177, 0.47058827],
...,
[0.30588236, 0.2509804 , 0.24705884],
[0.29803923, 0.24313727, 0.2392157 ],
[0.3137255 , 0.25882354, 0.24705884]],
...,
[[0.18039216, 0.08627451, 0.14901961],
[0.18039216, 0.08235294, 0.15686275],
[0.1764706 , 0.07843138, 0.16078432],
...,
[0.4784314 , 0.47450984, 0.4039216 ],
[0.44705886, 0.4431373 , 0.37254903],
[0.43529415, 0.43137258, 0.36078432]],
[[0.18431373, 0.07843138, 0.13725491],
[0.18039216, 0.08627451, 0.14901961],
[0.1764706 , 0.08235294, 0.14509805],
...,
[0.36862746, 0.3647059 , 0.29411766],
[0.34117648, 0.34509805, 0.27450982],
[0.34901962, 0.3529412 , 0.28235295]],
[[0.18431373, 0.08235294, 0.13333334],
[0.19215688, 0.09019608, 0.14117648],
[0.18431373, 0.09019608, 0.14509805],
...,
[0.5411765 , 0.53333336, 0.47450984],
[0.5294118 , 0.53333336, 0.47058827],
[0.5568628 , 0.56078434, 0.49803925]]]], dtype=float32),
(224, 224, 3))
The images are (224, 224, 3) from our target_size, and the values range from 0 to 1 thanks to rescaling.
How about the labels?
# view labels
labels
array([0., 1., 1., 0., 1., 0., 1., 1., 0., 1., 0., 0., 1., 1., 1., 0., 0.,
0., 0., 1., 1., 1., 0., 0., 0., 0., 1., 1., 0., 1., 0., 1.],
dtype=float32)
Our class mode is binary, so the labels are either 0 (pizza) or 1 (steak).
Our data is now ready, and the model can figure out the patterns between image tensors and labels.
3. Create a model (start with a baseline)¶
What should our baseline architecture be? There are many possible answers.
A simple way to get started on a vision model is to use an architecture that performs well on ImageNet (a large collection of diverse images used to benchmark computer vision models).
But before that, it's good to build a smaller model to get a baseline result you can improve upon.
Note: In deep learning terms, a small model is one with far fewer layers than the state of the art (SOTA): something with 3-4 layers, whereas ResNet50 has 50+ layers.
For our small model, we'll take inspiration from the one on the CNN Explainer website (model_1 from above) and build a 3 layer convolutional neural network.
# things to import for our model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPool2D, Activation
from tensorflow.keras import Sequential
# Create the model (can be our baseline with 3 convolutional layers)
model_4 = Sequential([
Conv2D(filters=10, kernel_size=3, strides=1, padding='valid', activation='relu', input_shape=(224,224,3)),
# filters = the number of features the model will learn
# padding = determines whether to keep or discard original spatial/image size. valid = shrink down to where kernel_size can apply its filters
Conv2D(10,3,activation='relu'),
Conv2D(10,3,activation='relu'),
Flatten(),
Dense(1, activation='sigmoid')
])
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\convolutional\base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
This follows a typical CNN structure of:
Input > Conv + ReLU layers (non-linearities) > Pooling layer > Fully connected (dense layer) as Output
Let's discuss some of the components of the Conv2D layer:
- The "2D" means our inputs are two dimensional (height and width), even though they have 3 colour channels, the convolutions are run on each channel invididually.
- filters - these are the number of "feature extractors" that will be moving over our images.
- kernel_size - the size of our filters, for example, a kernel_size of (3, 3) (or just 3) will mean each filter will have the size 3x3, meaning it will look at a space of 3x3 pixels each time. The smaller the kernel, the more fine-grained features it will extract.
- stride - the number of pixels a filter will move across as it covers the image. A stride of 1 means the filter moves across each pixel 1 by 1. A stride of 2 means it moves 2 pixels at a time.
- padding - this can be either 'same' or 'valid'. 'same' adds zeros to the outside of the image so the output of the convolutional layer is the same size as the input, whereas 'valid' (the default) applies no padding, so the output shrinks: a 3x3 kernel sliding over a 224-pixel-wide input produces 224 - 3 + 1 = 222 outputs, trimming 2 pixels off the edges.
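The shrinkage arithmetic can be checked with a small helper (a hypothetical function, just to illustrate; it's not part of Keras):

```python
def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Compute the output width/height of a conv layer (illustrative helper)."""
    if padding == "same":
        # 'same' pads so the output matches the input size (for stride 1)
        return -(-input_size // stride)  # ceiling division
    # 'valid': the kernel must fit entirely inside the image
    return (input_size - kernel_size) // stride + 1

print(conv_output_size(224, 3))                  # 222 - matches conv2d_4's output shape
print(conv_output_size(224, 3, padding="same"))  # 224 - input size preserved
```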
Now let's compile the model
# compile the model
model_4.compile(loss='binary_crossentropy',
optimizer=Adam(),
metrics=['accuracy'])
4. Fit a model¶
It's time to fit our model. But you may notice two extra parameters:
- steps_per_epoch - the number of batches the model will go through per epoch. If the batch size is 32 and steps is 10, it will go through 10 batches, i.e. 320 images. We want it to go through all the images in our train_data (1500 images / 32 per batch ≈ 47 steps).
- validation_steps - same as above, but for the validation data (our test folder): 500 images / 32 per batch ≈ 16 steps.
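The step counts above are just ceiling division of the dataset size by the batch size (a quick check with the numbers from our dataset):

```python
import math

batch_size = 32
train_images, test_images = 1500, 500  # image counts found by flow_from_directory

# one step = one batch; round up so the last partial batch is included
steps_per_epoch = math.ceil(train_images / batch_size)
validation_steps = math.ceil(test_images / batch_size)

print(steps_per_epoch, validation_steps)  # 47 16
```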
# check length of training and test data generators
len(train_data), len(test_data)
(47, 16)
# fit the model
history_4 = model_4.fit(train_data,
epochs=5,
steps_per_epoch=len(train_data),
validation_data=test_data,
validation_steps=len(test_data))
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\trainers\data_adapters\py_dataset_adapter.py:121: UserWarning: Your `PyDataset` class should call `super().__init__(**kwargs)` in its constructor. `**kwargs` can include `workers`, `use_multiprocessing`, `max_queue_size`. Do not pass these arguments to `fit()`, as they will be ignored. self._warn_if_super_not_called()
Epoch 1/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 28s 541ms/step - accuracy: 0.5709 - loss: 1.4794 - val_accuracy: 0.7260 - val_loss: 0.5936 Epoch 2/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 9s 197ms/step - accuracy: 0.7388 - loss: 0.5303 - val_accuracy: 0.8020 - val_loss: 0.4486 Epoch 3/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 10s 205ms/step - accuracy: 0.8686 - loss: 0.3660 - val_accuracy: 0.7660 - val_loss: 0.4861 Epoch 4/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 10s 214ms/step - accuracy: 0.9386 - loss: 0.1896 - val_accuracy: 0.8340 - val_loss: 0.4227 Epoch 5/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 10s 215ms/step - accuracy: 0.9898 - loss: 0.0587 - val_accuracy: 0.8460 - val_loss: 0.4319
5. Evaluate the model¶
Oh yeah! Looks like our model is learning something!
Let's check the training curves
!pip install pandas
Successfully installed pandas-2.2.3 pytz-2025.2 tzdata-2025.2
# plot the training curves
import pandas as pd
pd.DataFrame(history_4.history).plot(figsize=(10,7));
Training accuracy almost reaches 100% after 5 epochs, but val_accuracy struggles to keep up. There's a possibility the model is overfitting.
Let's separate the loss curves from the accuracy curves to get a clearer picture.
# plot the loss and accuracy data separately
def plot_loss_curves(history):
"""
Returns separate loss curves for training and validation metrics.
"""
loss = history.history['loss']
val_loss = history.history['val_loss']
accuracy = history.history['accuracy']
val_accuracy = history.history['val_accuracy']
epochs = range(len(history.history['loss']))
# plot loss
plt.figure()
plt.plot(epochs, loss, label='Training Loss')
plt.plot(epochs, val_loss, label='Validation Loss')
plt.title('Loss')
plt.xlabel('Epochs')
plt.legend();
# plot accuracy
plt.figure()
plt.plot(epochs, accuracy, label='Training Accuracy')
plt.plot(epochs, val_accuracy, label='Validation Accuracy')
plt.title('Accuracy')
plt.xlabel('Epochs')
plt.legend();
# check out loss and accuracy curves in model_4
plot_loss_curves(history_4)
Ideally, the validation curves should trail closely behind the training curves in terms of accuracy and loss. If you notice the gap growing over time, the model is likely overfitting.
# check model's architecture
model_4.summary()
Model: "sequential_4"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ conv2d_4 (Conv2D) │ (None, 222, 222, 10) │ 280 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_5 (Conv2D) │ (None, 220, 220, 10) │ 910 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_6 (Conv2D) │ (None, 218, 218, 10) │ 910 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ flatten_4 (Flatten) │ (None, 475240) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_10 (Dense) │ (None, 1) │ 475,241 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 1,432,025 (5.46 MB)
Trainable params: 477,341 (1.82 MB)
Non-trainable params: 0 (0.00 B)
Optimizer params: 954,684 (3.64 MB)
6. Adjust the model parameters¶
There are 3 steps to fitting a machine learning model:
- Create a baseline
- Beat the baseline by overfitting a larger model
- Reduce overfitting
We've done steps 1 and 2, but there are other ways we could overfit further:
- Increase the number of convolutional layers
- Increase the number of convolutional filters
- Add another dense layer after our flattened layer
Our focus now is to bring the training and validation curves closer together, i.e. reduce overfitting.
But why care about overfitting if training accuracy is so good? A model that performs well on training data but poorly on validation data has a poor ability to predict on unseen data. It isn't generalizing to real-world patterns; instead it's memorizing quirks of the training data that aren't really patterns at all.
So for the next few models we build, we'll adjust the number of parameters and inspect the training curves as well.
We'll build 2 models:
- A CNN with max pooling
- A CNN with max pooling and data augmentation
For the first model, we'll follow this CNN structure:
Input > Conv layers + ReLU layers > Max pooling layers > Fully connected (Dense) layers as output
This model will have the same structure as model_4, but with max pooling layers included.
# create the model (a 3 layer convolutional neural network)
model_5 = Sequential([
Conv2D(10,3,activation='relu',input_shape=(224,224,3)),
MaxPool2D(pool_size=2), # reduces number of features by half
Conv2D(10,3,activation='relu'),
MaxPool2D(), # doing this for every CNN layer
Conv2D(10,3,activation='relu'),
MaxPool2D(),
Flatten(),
Dense(1,activation='sigmoid')
])
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\convolutional\base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
# compile the model
model_5.compile(loss='binary_crossentropy',
optimizer=Adam(),
metrics=['accuracy'])
# fit the model
history_5 = model_5.fit(train_data,
epochs=5,
steps_per_epoch=len(train_data),
validation_data=test_data,
validation_steps=len(test_data))
Epoch 1/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 25s 491ms/step - accuracy: 0.6468 - loss: 0.6387 - val_accuracy: 0.8000 - val_loss: 0.4548 Epoch 2/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 6s 130ms/step - accuracy: 0.7494 - loss: 0.5103 - val_accuracy: 0.8000 - val_loss: 0.4095 Epoch 3/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 6s 132ms/step - accuracy: 0.8167 - loss: 0.4251 - val_accuracy: 0.8260 - val_loss: 0.3699 Epoch 4/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 6s 132ms/step - accuracy: 0.8250 - loss: 0.3963 - val_accuracy: 0.8360 - val_loss: 0.3585 Epoch 5/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 6s 132ms/step - accuracy: 0.8144 - loss: 0.3979 - val_accuracy: 0.8740 - val_loss: 0.3554
It seems model_5 performs worse on the training data, but better on the validation data.
Before checking the training curves, let's look at the architecture.
# check model architecture
model_5.summary()
Model: "sequential_5"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ conv2d_7 (Conv2D) │ (None, 222, 222, 10) │ 280 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_2 (MaxPooling2D) │ (None, 111, 111, 10) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_8 (Conv2D) │ (None, 109, 109, 10) │ 910 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_3 (MaxPooling2D) │ (None, 54, 54, 10) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_9 (Conv2D) │ (None, 52, 52, 10) │ 910 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_4 (MaxPooling2D) │ (None, 26, 26, 10) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ flatten_5 (Flatten) │ (None, 6760) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_11 (Dense) │ (None, 1) │ 6,761 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 26,585 (103.85 KB)
Trainable params: 8,861 (34.61 KB)
Non-trainable params: 0 (0.00 B)
Optimizer params: 17,724 (69.24 KB)
Notice how MaxPooling2D halves the output shape every time it's applied, keeping only the most important value within each 2x2 square before shrinking down.
The bigger the pool_size, the more aggressively max pooling squeezes features out of the image. But if it's too big, too few features remain and the model learns nothing.
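To see exactly what a 2x2 max pool does, here's a minimal NumPy sketch (the same idea as MaxPool2D, not Keras's actual implementation):

```python
import numpy as np

# a 4x4 single-channel "feature map"
feature_map = np.array([[1, 3, 2, 1],
                        [4, 2, 0, 1],
                        [5, 1, 2, 2],
                        [0, 1, 3, 4]])

# 2x2 max pooling with stride 2: keep only the max of each 2x2 square
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[4 2]
#  [5 4]]
```

Each 2x2 block collapses to its largest value, so a 4x4 map becomes 2x2 - the same halving we see in the model summary.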
It's a major reduction in trainable parameters: from 477,341 in model_4 down to 8,861.
Time to check loss curves
# plot loss curves of model_5 results
plot_loss_curves(history_5)
We can see the curves are closer to each other, but the validation loss looks like it's flattening out, risking overfitting again.
Time to try another overfitting prevention technique: data augmentation.
We'll first see how it's done in code, then dig into how it works. For data augmentation, we'll need to create new ImageDataGenerator instances.
# create ImageDataGenerator training instance with data augmentation
train_datagen_augmentation = ImageDataGenerator(rescale=1/255.,
rotation_range=20, # rotate the image between 0 and 20 degrees,
shear_range=0.2, # shear the image,
zoom_range=0.2, # zoom into the image,
width_shift_range=0.2, # shift the image width ways,
height_shift_range=0.2, # shift the image height ways,
horizontal_flip=True) # flipping image in horizontal axis
# create imagedatagenerator training instance without data augmentation
train_datagen = ImageDataGenerator(rescale=1/255.)
# create imagedatagenerator test instance without data augmentation
test_datagen = ImageDataGenerator(rescale=1/255.)
Now, what's data augmentation?
It's the process of slightly altering our training data: rotating an image a little, flipping it horizontally, or cropping the edges. This adds diversity to the training data and helps the model learn more generalizable patterns.
This helps simulate the data a model may encounter in the real world.
Note: Data augmentation is only ever performed on training data. ImageDataGenerator applies random transformations to images as they're loaded. If we augmented the validation/test data too, we wouldn't be able to compare evaluations properly, since the images would change every time we validated.
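For intuition, a horizontal flip (one of the augmentations we enabled) is just reversing the width axis of the image tensor (a NumPy sketch, not ImageDataGenerator's internals):

```python
import numpy as np

# a tiny 1x3 RGB "image": red, green, blue pixels from left to right
image = np.array([[[1, 0, 0], [0, 1, 0], [0, 0, 1]]], dtype=np.float32)

# flip along the width axis (axis=1), as horizontal_flip=True does at random
flipped = image[:, ::-1, :]
print(flipped[0, 0])  # the blue pixel is now first: [0. 0. 1.]
```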
# import data and augment it from training directory
print('Augmented training images:')
train_data_augmented = train_datagen_augmentation.flow_from_directory(train_dir,
target_size=(224,224),
batch_size=32,
class_mode='binary',
shuffle=False) # keeping this for demonstration purposes. But it's good to shuffle
# create non-augmented data batches
print('Non-augmented training images:')
train_data = train_datagen.flow_from_directory(train_dir,
target_size=(224,224),
batch_size=32,
class_mode='binary',
shuffle=False)
print('Unchanged test images:')
test_data = test_datagen.flow_from_directory(test_dir,
target_size=(224,224),
batch_size=32,
class_mode='binary',
shuffle=False) # shuffle expects a boolean ('binary' here was a typo)
Augmented training images: Found 1500 images belonging to 2 classes. Non-augmented training images: Found 1500 images belonging to 2 classes. Unchanged test images: Found 500 images belonging to 2 classes.
Let's visualize the augmented, and non augmented data
!pip install scipy
Successfully installed scipy-1.15.3
# get data batch samples
images, labels = next(train_data)
augmented_images, augmented_labels = next(train_data_augmented) # labels aren't augmented by the way
# show original image and augmented image
random_number = random.randint(0,31) # since we're pulling a batch of 32, pick a random index within the batch
plt.imshow(images[random_number])
plt.title(f'Original image')
plt.axis(False)
plt.figure()
plt.imshow(augmented_images[random_number])
plt.title(f'Augmented image')
plt.axis(False);
You can see the slight difference between the original and augmented images. The slight warping, shifting and rotating forces the model to learn patterns from less-than-ideal images, which is often the case with real-world photos.
Data augmentation is a great remedy if you find your model is overfitting too much. As for how strong augmentation should be, there's no set rule. It's best to check the options in the ImageDataGenerator class and think about your particular use case, model and data.
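To build some intuition for what an augmentation actually does to the pixels, here's a minimal sketch of a horizontal flip using plain NumPy (purely for illustration; ImageDataGenerator does this internally when horizontal_flip=True):

```python
import numpy as np

# a tiny fake "image": 2x3 pixels with 3 colour channels, values scaled to 0-1
img = np.arange(18, dtype=np.float32).reshape(2, 3, 3) / 17.

# horizontal flip = reverse the width axis
# (axis 1 in the height-width-channels layout our images use)
flipped = img[:, ::-1, :]

print(img[0, :, 0])      # first row, first channel, original left-to-right order
print(flipped[0, :, 0])  # same row after flipping
```

The label stays the same (a flipped steak is still a steak), which is why only the images get augmented, not the labels.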
Now let's try refitting a model with the same architecture as model_5 on the augmented images
# create the model (same as model_5)
model_6 = Sequential([
Conv2D(10,3,activation='relu',input_shape=(224,224,3)),
MaxPool2D(pool_size=2),
Conv2D(10,3,activation='relu'),
MaxPool2D(),
Conv2D(10,3,activation='relu'),
MaxPool2D(),
Flatten(),
Dense(1,activation='sigmoid')
])
# compile model
model_6.compile(loss='binary_crossentropy',
optimizer='Adam',
metrics=['accuracy'])
# fit the model
history_6 = model_6.fit(train_data_augmented,
epochs=5,
steps_per_epoch=len(train_data_augmented),
validation_data=test_data,
validation_steps=len(test_data))
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\convolutional\base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
Epoch 1/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 20s 404ms/step - accuracy: 0.5125 - loss: 0.8797 - val_accuracy: 0.4920 - val_loss: 0.6938 Epoch 2/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 18s 390ms/step - accuracy: 0.4448 - loss: 0.6950 - val_accuracy: 0.5240 - val_loss: 0.6920 Epoch 3/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 18s 382ms/step - accuracy: 0.5127 - loss: 0.6914 - val_accuracy: 0.5080 - val_loss: 0.6865 Epoch 4/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 18s 381ms/step - accuracy: 0.5620 - loss: 0.6829 - val_accuracy: 0.6200 - val_loss: 0.6837 Epoch 5/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 18s 390ms/step - accuracy: 0.5723 - loss: 0.6834 - val_accuracy: 0.5460 - val_loss: 0.6752
It appears the model didn't get results as good this time. Why?
It's because we turned off data shuffling with shuffle=False, which means the model sees the same batches of images in the same order every epoch.
Our data is organized by folders, and without shuffling the generator draws out the pizza folder first, so early batches contain only pizza and no steak to compare against. Shuffling solves this by mixing pizza and steak into every batch.
Now that we know what's wrong, we can flip back to shuffle=True, since we're done with the demonstration.
You can also see how data augmentation increases training time. One option to speed things up is TensorFlow's parallel reads and buffered prefetching.
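As a sketch of what buffered prefetching could look like (this assumes a tf.data pipeline rather than ImageDataGenerator, which we cover later), prefetching lets the CPU prepare the next batch while the current one is being trained on:

```python
import tensorflow as tf

# a toy dataset standing in for our image batches (illustrative only)
dataset = tf.data.Dataset.from_tensor_slices(tf.range(10))

# batch, then prefetch; AUTOTUNE lets TensorFlow pick the buffer size
dataset = dataset.batch(2).prefetch(tf.data.AUTOTUNE)

for batch in dataset.take(2):
    print(batch.numpy())
```

The same .prefetch(tf.data.AUTOTUNE) call applies to real image datasets, where the overlap between data preparation and training saves far more time than in this toy example.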
# check model performance
plot_loss_curves(history_6)
Our accuracy is jumping around quite a lot. It's heading in the right direction, but ideally we want a smooth ascent that gets close to 1.
Now let's try the shuffled augmented data
# import data and augment it from directories
train_data_augmented_shuffled = train_datagen_augmentation.flow_from_directory(train_dir,
target_size=(224,224),
batch_size=32,
class_mode='binary',
shuffle=True)
Found 1500 images belonging to 2 classes.
# create model, same as 5 and 6
model_7 = Sequential([
Conv2D(10,3,activation='relu',input_shape=(224,224,3)),
MaxPool2D(pool_size=2),
Conv2D(10,3,activation='relu'),
MaxPool2D(),
Conv2D(10,3,activation='relu'),
MaxPool2D(),
Flatten(),
Dense(1,activation='sigmoid')
])
# compile model
model_7.compile(loss='binary_crossentropy',
optimizer='Adam',
metrics=['accuracy'])
# fit the model
history_7 = model_7.fit(train_data_augmented_shuffled,
epochs=5,
steps_per_epoch=len(train_data_augmented_shuffled),
validation_data=test_data,
validation_steps=len(test_data))
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\convolutional\base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs) x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\trainers\data_adapters\py_dataset_adapter.py:121: UserWarning: Your `PyDataset` class should call `super().__init__(**kwargs)` in its constructor. `**kwargs` can include `workers`, `use_multiprocessing`, `max_queue_size`. Do not pass these arguments to `fit()`, as they will be ignored. self._warn_if_super_not_called()
Epoch 1/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 18s 367ms/step - accuracy: 0.5350 - loss: 0.7032 - val_accuracy: 0.7380 - val_loss: 0.5723 Epoch 2/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 17s 368ms/step - accuracy: 0.7090 - loss: 0.5906 - val_accuracy: 0.8240 - val_loss: 0.4201 Epoch 3/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 17s 363ms/step - accuracy: 0.7435 - loss: 0.5351 - val_accuracy: 0.8160 - val_loss: 0.3851 Epoch 4/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 17s 362ms/step - accuracy: 0.7734 - loss: 0.4889 - val_accuracy: 0.8280 - val_loss: 0.3554 Epoch 5/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 17s 358ms/step - accuracy: 0.7625 - loss: 0.5001 - val_accuracy: 0.8540 - val_loss: 0.3528
# check model's performance history training on augmented data
plot_loss_curves(history_7)
model_7 improved much more consistently than model_6, thanks to the shuffling.
Our loss curves also appear smoother.
7. Repeat until satisfied¶
We've beaten the baseline quite comfortably, and there are more ways to keep improving the model:
- Increase the model layers (e.g. more CNN layers)
- Increase the number of filters in each conv layer (e.g. from 10 to 32, 64 or 128; these are common values to trial)
- Train for longer (more epochs)
- Finding an ideal learning rate
- Get more data
- Use transfer learning to leverage what other image models have learned and adjust it for our specific use case
Adjusting these settings (except the last two) during development is known as hyperparameter tuning.
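For finding an ideal learning rate, one common approach (the values here are hypothetical, not from this notebook) is to ramp the learning rate up exponentially over a short trial fit, then plot loss against learning rate and pick the rate where loss drops fastest. The ramp itself is just a simple function:

```python
# exponentially increasing learning-rate schedule:
# start at 1e-4 and multiply by 10 every 20 epochs
def lr_schedule(epoch):
    return 1e-4 * 10 ** (epoch / 20)

# this function could be passed to tf.keras.callbacks.LearningRateScheduler
# during a short trial fit; here we just inspect a few of its values
for epoch in (0, 20, 40):
    print(f"epoch {epoch}: lr = {lr_schedule(epoch):.0e}")
```

After the trial fit, you'd restart training from scratch at the learning rate the plot suggests, rather than keep the ramped model.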
Let's go back to model_1 with the TinyVGG architecture
# Create a CNN model (same as Tiny VGG but for binary classification)
model_8 = Sequential([
Conv2D(10,3,activation='relu',input_shape=(224,224,3)),
Conv2D(10,3,activation='relu'),
MaxPool2D(),
Conv2D(10,3,activation='relu'),
Conv2D(10,3,activation='relu'),
MaxPool2D(),
Flatten(),
Dense(1, activation='sigmoid')
])
# compile the model
model_8.compile(loss='binary_crossentropy',
optimizer=tf.keras.optimizers.Adam(),
metrics=['accuracy'])
# fit the model
history_8 = model_8.fit(train_data_augmented_shuffled,
epochs=5,
steps_per_epoch=len(train_data_augmented_shuffled),
validation_data=test_data,
validation_steps=len(test_data))
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\convolutional\base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
Epoch 1/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 22s 422ms/step - accuracy: 0.6118 - loss: 0.6556 - val_accuracy: 0.8280 - val_loss: 0.4363 Epoch 2/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 20s 423ms/step - accuracy: 0.7695 - loss: 0.4911 - val_accuracy: 0.8680 - val_loss: 0.3613 Epoch 3/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 20s 419ms/step - accuracy: 0.7753 - loss: 0.4775 - val_accuracy: 0.8860 - val_loss: 0.3220 Epoch 4/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 20s 416ms/step - accuracy: 0.7944 - loss: 0.4477 - val_accuracy: 0.8620 - val_loss: 0.3247 Epoch 5/5 47/47 ━━━━━━━━━━━━━━━━━━━━ 20s 423ms/step - accuracy: 0.8036 - loss: 0.4459 - val_accuracy: 0.8880 - val_loss: 0.2907
Note: You may notice some differences between model_8 and model_1's code, mostly that layers are imported directly from tensorflow.keras.layers (e.g. Conv2D instead of tf.keras.layers.Conv2D). This reduces the amount of code but does the same thing, calling the same classes.
# check architecture of model_1
model_1.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ conv2d (Conv2D) │ (None, 222, 222, 10) │ 280 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_1 (Conv2D) │ (None, 220, 220, 10) │ 910 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d (MaxPooling2D) │ (None, 110, 110, 10) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_2 (Conv2D) │ (None, 108, 108, 10) │ 910 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_3 (Conv2D) │ (None, 106, 106, 10) │ 910 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_1 (MaxPooling2D) │ (None, 53, 53, 10) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ flatten (Flatten) │ (None, 28090) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense (Dense) │ (None, 1) │ 28,091 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 93,305 (364.48 KB)
Trainable params: 31,101 (121.49 KB)
Non-trainable params: 0 (0.00 B)
Optimizer params: 62,204 (242.99 KB)
# check architecture of model_8
model_8.summary()
Model: "sequential_11"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ conv2d_25 (Conv2D) │ (None, 222, 222, 10) │ 280 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_26 (Conv2D) │ (None, 220, 220, 10) │ 910 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_20 (MaxPooling2D) │ (None, 110, 110, 10) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_27 (Conv2D) │ (None, 108, 108, 10) │ 910 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_28 (Conv2D) │ (None, 106, 106, 10) │ 910 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_21 (MaxPooling2D) │ (None, 53, 53, 10) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ flatten_11 (Flatten) │ (None, 28090) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_17 (Dense) │ (None, 1) │ 28,091 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 93,305 (364.48 KB)
Trainable params: 31,101 (121.49 KB)
Non-trainable params: 0 (0.00 B)
Optimizer params: 62,204 (242.99 KB)
Now let's check our TinyVGG model's performance
# check out the TinyVGG model performance
plot_loss_curves(history_8)
# visually compare with model_1's history
plot_loss_curves(history_1)
The training curves look good, though the improvement over the previous model isn't all that impressive. Maybe it's time to give the model more training time, i.e. more epochs.
Making a prediction with our trained model¶
What good is a trained model if you can't make predictions with it? Let's load a couple of our own images and see how the model handles them.
# classes we're working with
print(class_names)
['pizza' 'steak']
We'll use this steak image to do our first test
# view our example image
steak=mpimg.imread('03-steak.jpg')
plt.imshow(steak)
plt.axis(False);
# check the shape of our image
steak.shape
(4032, 3024, 3)
Our model only takes inputs of shape (224, 224, 3), so we need to resize our custom image to fit the model's expected input.
We can do this with tf.io.read_file (for reading files) and tf.image (for turning the file into a tensor and resizing it).
# create a function to import an image and resize it, so it can be used in model
def load_and_prep_image(filename, img_shape=224):
"""
Reads an image from filename, turns it into a tensor and reshapes it to (img_shape, img_shape, colour_channels)
"""
# read in target file (on image)
img = tf.io.read_file(filename)
# decode read file into tensor, and confirm it still has 3 colour channels
# (our model is trained with 3 colour channels, but some images may have 4)
img = tf.image.decode_image(img, channels=3)
# resize image to the size the model has been trained on
img = tf.image.resize(img, size=[img_shape,img_shape])
# rescale image so all values are within 0 and 1
img = img/255.
return img
We now have a function to load custom images for our model. Time to load in the image
# load in and preprocess custom image
steak = load_and_prep_image('03-steak.jpg')
steak
<tf.Tensor: shape=(224, 224, 3), dtype=float32, numpy=
array([[[0.6377451 , 0.6220588 , 0.57892156],
[0.6504902 , 0.63186276, 0.5897059 ],
[0.63186276, 0.60833335, 0.5612745 ],
...,
[0.52156866, 0.05098039, 0.09019608],
[0.49509802, 0.04215686, 0.07058824],
[0.52843136, 0.07745098, 0.10490196]],
[[0.6617647 , 0.6460784 , 0.6107843 ],
[0.6387255 , 0.6230392 , 0.57598037],
[0.65588236, 0.63235295, 0.5852941 ],
...,
[0.5352941 , 0.06862745, 0.09215686],
[0.529902 , 0.05931373, 0.09460784],
[0.5142157 , 0.05539216, 0.08676471]],
[[0.6519608 , 0.6362745 , 0.5892157 ],
[0.6392157 , 0.6137255 , 0.56764704],
[0.65637255, 0.6269608 , 0.5828431 ],
...,
[0.53137255, 0.06470589, 0.08039216],
[0.527451 , 0.06862745, 0.1 ],
[0.52254903, 0.05196078, 0.0872549 ]],
...,
[[0.49313724, 0.42745098, 0.31029412],
[0.05441177, 0.01911765, 0. ],
[0.2127451 , 0.16176471, 0.09509804],
...,
[0.6132353 , 0.59362745, 0.57009804],
[0.65294117, 0.6333333 , 0.6098039 ],
[0.64166665, 0.62990195, 0.59460783]],
[[0.65392154, 0.5715686 , 0.45 ],
[0.6367647 , 0.54656863, 0.425 ],
[0.04656863, 0.01372549, 0. ],
...,
[0.6372549 , 0.61764705, 0.59411764],
[0.63529414, 0.6215686 , 0.5892157 ],
[0.6401961 , 0.62058824, 0.59705883]],
[[0.1 , 0.05539216, 0. ],
[0.48333332, 0.40882352, 0.29117647],
[0.65 , 0.5686275 , 0.44019607],
...,
[0.6308824 , 0.6161765 , 0.5808824 ],
[0.6519608 , 0.63186276, 0.5901961 ],
[0.6338235 , 0.6259804 , 0.57892156]]], dtype=float32)>
Nice, let's test it with the model!
model_8.predict(steak)
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) Cell In[111], line 1 ----> 1 model_8.predict(steak) File x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\utils\traceback_utils.py:122, in filter_traceback.<locals>.error_handler(*args, **kwargs) 119 filtered_tb = _process_traceback_frames(e.__traceback__) 120 # To get the full stack trace, call: 121 # `keras.config.disable_traceback_filtering()` --> 122 raise e.with_traceback(filtered_tb) from None 123 finally: 124 del filtered_tb File x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\models\functional.py:276, in Functional._adjust_input_rank(self, flat_inputs) 274 adjusted.append(ops.expand_dims(x, axis=-1)) 275 continue --> 276 raise ValueError( 277 f"Invalid input shape for input {x}. Expected shape " 278 f"{ref_shape}, but input has incompatible shape {x.shape}" 279 ) 280 # Add back metadata. 281 for i in range(len(flat_inputs)): ValueError: Exception encountered when calling Sequential.call(). Invalid input shape for input Tensor("data:0", shape=(32, 224, 3), dtype=float32). Expected shape (None, 224, 224, 3), but input has incompatible shape (32, 224, 3) Arguments received by Sequential.call(): • inputs=tf.Tensor(shape=(32, 224, 3), dtype=float32) • training=False • mask=None • kwargs=<class 'inspect._empty'>
Something's wrong... The image is the same shape the model was trained on, but one dimension is missing.
That's the batch dimension: the model expects inputs of shape (batch_size, 224, 224, 3).
We can fix this by adding an extra dimension with tf.expand_dims.
# add an extra axis
print(f'Shape before new dimension: {steak.shape}')
steak = tf.expand_dims(steak, axis=0) # add an extra dimension at axis 0
# steak = steak[tf.newaxis, ...] is the alternative for the above
print(f'Shape after new dimension: {steak.shape}')
steak
Shape before new dimension: (224, 224, 3) Shape after new dimension: (1, 224, 224, 3)
<tf.Tensor: shape=(1, 224, 224, 3), dtype=float32, numpy=
array([[[[0.6377451 , 0.6220588 , 0.57892156],
[0.6504902 , 0.63186276, 0.5897059 ],
[0.63186276, 0.60833335, 0.5612745 ],
...,
[0.52156866, 0.05098039, 0.09019608],
[0.49509802, 0.04215686, 0.07058824],
[0.52843136, 0.07745098, 0.10490196]],
[[0.6617647 , 0.6460784 , 0.6107843 ],
[0.6387255 , 0.6230392 , 0.57598037],
[0.65588236, 0.63235295, 0.5852941 ],
...,
[0.5352941 , 0.06862745, 0.09215686],
[0.529902 , 0.05931373, 0.09460784],
[0.5142157 , 0.05539216, 0.08676471]],
[[0.6519608 , 0.6362745 , 0.5892157 ],
[0.6392157 , 0.6137255 , 0.56764704],
[0.65637255, 0.6269608 , 0.5828431 ],
...,
[0.53137255, 0.06470589, 0.08039216],
[0.527451 , 0.06862745, 0.1 ],
[0.52254903, 0.05196078, 0.0872549 ]],
...,
[[0.49313724, 0.42745098, 0.31029412],
[0.05441177, 0.01911765, 0. ],
[0.2127451 , 0.16176471, 0.09509804],
...,
[0.6132353 , 0.59362745, 0.57009804],
[0.65294117, 0.6333333 , 0.6098039 ],
[0.64166665, 0.62990195, 0.59460783]],
[[0.65392154, 0.5715686 , 0.45 ],
[0.6367647 , 0.54656863, 0.425 ],
[0.04656863, 0.01372549, 0. ],
...,
[0.6372549 , 0.61764705, 0.59411764],
[0.63529414, 0.6215686 , 0.5892157 ],
[0.6401961 , 0.62058824, 0.59705883]],
[[0.1 , 0.05539216, 0. ],
[0.48333332, 0.40882352, 0.29117647],
[0.65 , 0.5686275 , 0.44019607],
...,
[0.6308824 , 0.6161765 , 0.5808824 ],
[0.6519608 , 0.63186276, 0.5901961 ],
[0.6338235 , 0.6259804 , 0.57892156]]]], dtype=float32)>
# make a prediction on custom image
pred = model_8.predict(steak)
pred
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 175ms/step
array([[0.8684933]], dtype=float32)
Our prediction comes out as a probability of the positive class. Since this is binary classification, 0.5 is the decision boundary between the two classes, so a value of ~0.87 means the model is most likely predicting the positive class (1).
But positive or negative class doesn't tell us whether the image is pizza or steak, so let's write a function to convert prediction probabilities to class names.
# we can index the predicted class by rounding the probability
pred_class = class_names[int(tf.round(pred)[0][0])] # the [0][0] selects the value from the 2D tensor, after rounding > [[1.0]] is what the number actually looks like when rounded
pred_class
np.str_('steak')
def pred_and_plot(model, filename, class_names):
"""
Imports an image located at filename, makes prediction with model, and plots image with predicted class as title
"""
# import the target image/preprocess it
img = load_and_prep_image(filename)
# make a prediction
pred = model.predict(tf.expand_dims(img, axis=0))
# get the predicted value
pred_class = class_names[int(tf.round(pred)[0][0])]
# plot the image and predicted label
plt.imshow(img)
plt.title(f"Prediction: {pred_class}")
plt.axis(False);
# test our model with custom image
pred_and_plot(model_8, '03-steak.jpg', class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 83ms/step
Nice, the model got it correct!
# download another image to make a prediction
pred_and_plot(model_8, '03-pizza-dad.jpeg', class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 39ms/step
Multi-class classification¶
We've referenced the TinyVGG architecture from the CNN explainer website. But the CNN explainer works with 10 categories, rather than binary classification like ours.
Let's go through the same steps again, but this time, we'll work with 10 different categories of food.

The workflow we're doing is a slightly modified version of the above. As you do more deep learning, the above workflow becomes more like an outline, rather than a step-by-step guide
1. Import and become one with the data¶
Going back to the Food101 dataset again: in addition to steak and pizza, we'll pull out 8 other categories for our next challenge.
import zipfile
import urllib.request
# Step 1: Download the zip file 10_food_classes
url = "https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_all_data.zip"
urllib.request.urlretrieve(url, "10_food_classes_all_data.zip") # saves the file locally
# Step 2: Unzip the file
with zipfile.ZipFile("10_food_classes_all_data.zip", "r") as zip_ref:
zip_ref.extractall() # extract all files to the current directory
Now let's check out all the directories/sub-directories inside 10_food_classes_all_data.
import os
# walk through 10_food_classes directory and list number of files
for dirpath, dirnames, filenames in os.walk('10_food_classes_all_data'):
print(f'There are {len(dirnames)} directories and {len(filenames)} images in "{dirpath}".')
There are 2 directories and 0 images in "10_food_classes_all_data". There are 10 directories and 0 images in "10_food_classes_all_data\test". There are 0 directories and 250 images in "10_food_classes_all_data\test\ice_cream". There are 0 directories and 250 images in "10_food_classes_all_data\test\chicken_curry". There are 0 directories and 250 images in "10_food_classes_all_data\test\steak". There are 0 directories and 250 images in "10_food_classes_all_data\test\sushi". There are 0 directories and 250 images in "10_food_classes_all_data\test\chicken_wings". There are 0 directories and 250 images in "10_food_classes_all_data\test\grilled_salmon". There are 0 directories and 250 images in "10_food_classes_all_data\test\hamburger". There are 0 directories and 250 images in "10_food_classes_all_data\test\pizza". There are 0 directories and 250 images in "10_food_classes_all_data\test\ramen". There are 0 directories and 250 images in "10_food_classes_all_data\test\fried_rice". There are 10 directories and 0 images in "10_food_classes_all_data\train". There are 0 directories and 750 images in "10_food_classes_all_data\train\ice_cream". There are 0 directories and 750 images in "10_food_classes_all_data\train\chicken_curry". There are 0 directories and 750 images in "10_food_classes_all_data\train\steak". There are 0 directories and 750 images in "10_food_classes_all_data\train\sushi". There are 0 directories and 750 images in "10_food_classes_all_data\train\chicken_wings". There are 0 directories and 750 images in "10_food_classes_all_data\train\grilled_salmon". There are 0 directories and 750 images in "10_food_classes_all_data\train\hamburger". There are 0 directories and 750 images in "10_food_classes_all_data\train\pizza". There are 0 directories and 750 images in "10_food_classes_all_data\train\ramen". There are 0 directories and 750 images in "10_food_classes_all_data\train\fried_rice".
Looks good! Now we set up train and test directory path.
train_dir = '10_food_classes_all_data/train/'
test_dir = '10_food_classes_all_data/test/'
Get the class names from subdirectories
# get the class names for our multi-class dataset
import pathlib
import numpy as np
data_dir = pathlib.Path(train_dir)
class_names = np.array(sorted([item.name for item in data_dir.glob('*')])) # * > matching everything inside the directory
print(class_names)
['chicken_curry' 'chicken_wings' 'fried_rice' 'grilled_salmon' 'hamburger' 'ice_cream' 'pizza' 'ramen' 'steak' 'sushi']
# view a random image in the file
import random
img = view_random_image(target_dir=train_dir, target_class=random.choice(class_names)) # gets a random class name
Image shape: (512, 382, 3)
2. Preprocess the data (prepare it for a model)¶
After going through some images (10-100), looks like everything is set up correctly.
Time to preprocess the data
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# rescale the data and create data generator instances
train_datagen = ImageDataGenerator(rescale=1/255.)
test_datagen = ImageDataGenerator(rescale=1/255.)
# load data into directory, and turn them into batches
train_data = train_datagen.flow_from_directory(train_dir,
target_size=(224,224),
batch_size=32,
class_mode='categorical')
test_data = test_datagen.flow_from_directory(test_dir,
target_size=(224,224),
batch_size=32,
class_mode='categorical')
Found 7500 images belonging to 10 classes. Found 2500 images belonging to 10 classes.
The main change from our binary classifier to 10 categories is setting class_mode from 'binary' to 'categorical'. Everything else stays the same.
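The practical difference is the label format each class_mode yields: 'binary' gives one scalar 0/1 label per image, while 'categorical' gives a one-hot vector per image. A rough NumPy sketch of the two formats (illustrative only, not the generator's actual code):

```python
import numpy as np

num_classes = 10

# class_mode='binary': one scalar label per image (e.g. 0 = pizza, 1 = steak)
binary_labels = np.array([0., 1., 1.], dtype=np.float32)

# class_mode='categorical': one one-hot row per image
class_indices = np.array([2, 7, 9])  # hypothetical classes for 3 images
categorical_labels = np.eye(num_classes, dtype=np.float32)[class_indices]

print(binary_labels.shape)       # -> (3,)
print(categorical_labels.shape)  # -> (3, 10)
```

This is also why the loss function changes: binary_crossentropy expects the scalar labels, while categorical_crossentropy expects the one-hot rows.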
Question: Why do we set our images to 224x224? It's a very common default size for preprocessing images, but depending on your problem, a bigger or smaller image size may be needed.
3. Create a model (start with a baseline)¶
We can use the same model (TinyVGG) we've used for binary classification problem, for our multi-class classification problem with a couple of small tweaks.
Namely:
- Changing the output layer to have 10 output neurons (the same number as our categories).
- Changing the output layer activation to softmax instead of sigmoid.
- Changing the loss function to categorical_crossentropy instead of binary_crossentropy.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense
# create our model (clone of model_8, but for multi-class classification)
model_9 = Sequential([
Conv2D(10,3,activation='relu',input_shape=(224,224,3)),
Conv2D(10,3,activation='relu'),
MaxPool2D(),
Conv2D(10,3,activation='relu'),
Conv2D(10,3,activation='relu'),
MaxPool2D(),
Flatten(),
Dense(10,activation='softmax')
])
# compile the model
model_9.compile(loss='categorical_crossentropy',
optimizer='Adam',
metrics=['accuracy'])
4. Fit a model¶
Now we've got a model fit for multiple classes. Let's fit it to the data.
# fit the model
history_9 = model_9.fit(train_data,
epochs=5,
steps_per_epoch=len(train_data),
validation_data=test_data,
validation_steps=len(test_data))
Epoch 1/5 235/235 ━━━━━━━━━━━━━━━━━━━━ 84s 351ms/step - accuracy: 0.1707 - loss: 2.2480 - val_accuracy: 0.2756 - val_loss: 1.9998 Epoch 2/5 235/235 ━━━━━━━━━━━━━━━━━━━━ 48s 205ms/step - accuracy: 0.3320 - loss: 1.9258 - val_accuracy: 0.2648 - val_loss: 2.0030 Epoch 3/5 235/235 ━━━━━━━━━━━━━━━━━━━━ 44s 186ms/step - accuracy: 0.4773 - loss: 1.5691 - val_accuracy: 0.2796 - val_loss: 2.0894 Epoch 4/5 235/235 ━━━━━━━━━━━━━━━━━━━━ 43s 182ms/step - accuracy: 0.7336 - loss: 0.8674 - val_accuracy: 0.2568 - val_loss: 2.6161 Epoch 5/5 235/235 ━━━━━━━━━━━━━━━━━━━━ 43s 181ms/step - accuracy: 0.9222 - loss: 0.3081 - val_accuracy: 0.2904 - val_loss: 3.7356
It takes noticeably longer to train this model, despite the same number of epochs as our binary model. That's because of the amount of data: each class has 750 training images and 250 test images, and with 10 classes instead of 2, the model is working through 7,500 training images, 5 times as much data as the binary problem's 1,500.
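The step counts in the training logs follow directly from the dataset sizes and the batch size of 32; a quick sanity check using the numbers above:

```python
import math

batch_size = 32

# binary problem: 2 classes x 750 training images each
binary_train_images = 2 * 750
# multi-class problem: 10 classes x 750 training images each
multi_train_images = 10 * 750

print(math.ceil(binary_train_images / batch_size))  # -> 47 steps per epoch
print(math.ceil(multi_train_images / batch_size))   # -> 235 steps per epoch
print(multi_train_images / binary_train_images)     # -> 5.0 (times as much data)
```

These match the 47/47 and 235/235 step counts in the fit outputs above.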
5. Evaluate the model¶
Yay we've trained the model :) Let's see it visually
# evaluate on test data
model_9.evaluate(test_data)
79/79 ━━━━━━━━━━━━━━━━━━━━ 8s 95ms/step - accuracy: 0.3031 - loss: 3.6326
[3.735577344894409, 0.2903999984264374]
# check model curves
plot_loss_curves(history_9)
Hmm, pretty poor results on our validation loss curve. So what does this say?
The model is overfitting the training data and generalizing poorly to data it hasn't seen.
6. Adjust the model parameters¶
It's clear the model is learning something, but not in the direction we want. Ideally, validation performance should track training performance, so our next step is to try to prevent overfitting. Some options:
- Get more data - Simplest but hardest answer. It gives more opportunity for the model to learn patterns
- Simplify model - If the model overfits, it can mean the model is too complicated. It's learning patterns too well, or learning patterns that aren't really patterns. Which makes it hard to generalize on unseen data. We can reduce layers or the number of hidden units.
- Use data augmentation - Manipulating the training data slightly, which makes learning harder/adds more variety in data. If the model can learn from augmented data, then it may be able to generalize better on unseen data.
- Use transfer learning - We can leverage an already-trained model that has learned patterns in similar data as the foundation for our own task, such as taking a general computer vision model and tweaking it slightly to suit our food data.
For now, let's simplify the model first. We'll remove two conv layers, so we go from four to two.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense
# create our model (clone of model_8, but for classification)
model_10 = Sequential([
Conv2D(10,3,activation='relu',input_shape=(224,224,3)),
MaxPool2D(),
Conv2D(10,3,activation='relu'),
MaxPool2D(),
Flatten(),
Dense(10,activation='softmax')
])
# compile the model
model_10.compile(loss='categorical_crossentropy',
optimizer='Adam',
metrics=['accuracy'])
# fit the model
history_10 = model_10.fit(train_data,
epochs=5,
steps_per_epoch=len(train_data),
validation_data=test_data,
validation_steps=len(test_data))
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\convolutional\base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(activity_regularizer=activity_regularizer, **kwargs)
Epoch 1/5 235/235 ━━━━━━━━━━━━━━━━━━━━ 39s 160ms/step - accuracy: 0.1917 - loss: 2.1982 - val_accuracy: 0.2984 - val_loss: 1.9639 Epoch 2/5 235/235 ━━━━━━━━━━━━━━━━━━━━ 35s 150ms/step - accuracy: 0.4483 - loss: 1.6929 - val_accuracy: 0.3288 - val_loss: 1.9280 Epoch 3/5 235/235 ━━━━━━━━━━━━━━━━━━━━ 36s 152ms/step - accuracy: 0.6271 - loss: 1.2168 - val_accuracy: 0.2868 - val_loss: 2.1241 Epoch 4/5 235/235 ━━━━━━━━━━━━━━━━━━━━ 37s 158ms/step - accuracy: 0.7917 - loss: 0.7238 - val_accuracy: 0.2924 - val_loss: 2.4491 Epoch 5/5 235/235 ━━━━━━━━━━━━━━━━━━━━ 37s 157ms/step - accuracy: 0.9233 - loss: 0.3336 - val_accuracy: 0.2752 - val_loss: 3.0246
# check out loss curves of model_10
plot_loss_curves(history_10)
Well, it seems our simplified model didn't help. Maybe we should try data augmentation?
To do this, we need another ImageDataGenerator instance, this time with parameters such as rotation_range and horizontal_flip to manipulate our images.
# create augmented data generator instance
train_datagen_augmented = ImageDataGenerator(rescale=1/255.,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
train_data_augmented = train_datagen_augmented.flow_from_directory(train_dir,
target_size=(224,224),
batch_size=32,
class_mode='categorical')
Found 7500 images belonging to 10 classes.
With augmentation set up, we can try model_10 again. We don't have to rewrite the model; we can use the handy clone_model function, which takes an existing model and rebuilds it in the same format.
The cloned model doesn't carry over anything the original model learned, so we're training it from a clean slate.
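A small sketch of that behaviour, using a tiny hypothetical model rather than our CNN: the clone shares the architecture but gets freshly initialized weights.

```python
import numpy as np
import tensorflow as tf

# a tiny model standing in for model_10 (illustrative only)
original = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation='relu'),
])

clone = tf.keras.models.clone_model(original)

# same architecture (same number and type of layers)...
print(len(clone.layers) == len(original.layers))

# ...but the clone's weights are re-initialized, not copied,
# so the two kernels will almost surely differ
print(np.allclose(original.layers[0].kernel.numpy(),
                  clone.layers[0].kernel.numpy()))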
Note: A key practice in deep learning is to be a serial experimenter: try something, see if it works, then try something else. A good experimenter also keeps track of what was changed at each step and what results came of it. For our example, that's augmenting the data and trying it on our previous model to see whether the loss curves change.
# clone the model
model_11 = tf.keras.models.clone_model(model_10)
# compile model
model_11.compile(loss='categorical_crossentropy',
optimizer='Adam',
metrics=['accuracy'])
# fit the model
history_11 = model_11.fit(train_data_augmented,
epochs=5,
steps_per_epoch=len(train_data_augmented),
validation_data=test_data,
validation_steps=len(test_data))
Epoch 1/5 235/235 ━━━━━━━━━━━━━━━━━━━━ 104s 439ms/step - accuracy: 0.1404 - loss: 2.4096 - val_accuracy: 0.2552 - val_loss: 2.0999 Epoch 2/5 235/235 ━━━━━━━━━━━━━━━━━━━━ 104s 444ms/step - accuracy: 0.2399 - loss: 2.1277 - val_accuracy: 0.2788 - val_loss: 2.0456 Epoch 3/5 235/235 ━━━━━━━━━━━━━━━━━━━━ 101s 429ms/step - accuracy: 0.2750 - loss: 2.0650 - val_accuracy: 0.3332 - val_loss: 1.9077 Epoch 4/5 235/235 ━━━━━━━━━━━━━━━━━━━━ 101s 429ms/step - accuracy: 0.2827 - loss: 2.0143 - val_accuracy: 0.3288 - val_loss: 1.9321 Epoch 5/5 235/235 ━━━━━━━━━━━━━━━━━━━━ 102s 435ms/step - accuracy: 0.3084 - loss: 1.9931 - val_accuracy: 0.3700 - val_loss: 1.8488
You can see training takes much longer now, because the images are augmented on the fly during training, and the augmentation runs on the CPU rather than the GPU.
Note: One way to improve the time taken is to build augmentation into the model itself with layers such as
tf.keras.layers.RandomFlip, which can run on the GPU. Data loading can also be sped up with tf.keras.utils.image_dataset_from_directory, a faster image-loading API. (Will be covered later on)
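As a rough sketch of what that note describes, augmentation can be built into a Keras pipeline with preprocessing layers. The parameter values below are illustrative, not tuned:

```python
import tensorflow as tf

# Sketch: augmentation as Keras layers instead of ImageDataGenerator.
# These layers only activate during training (training=True).
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),  # mirror images left/right
    tf.keras.layers.RandomRotation(0.1),       # rotate up to ~10% of a full turn
    tf.keras.layers.RandomZoom(0.2),           # zoom in/out by up to 20%
])

# Apply to a dummy batch of (batch, height, width, channels) "images"
images = tf.random.uniform((8, 224, 224, 3))
augmented = data_augmentation(images, training=True)
print(augmented.shape)  # (8, 224, 224, 3) -- shape is unchanged
```

Because these layers are part of the model graph, the augmentation can run on the same device as the rest of the forward pass.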
So how do the loss curves look?
# check out model performance of model_11
plot_loss_curves(history_11)
Performance definitely looks better! The loss curves are much closer together, and although training accuracy on the augmented data is lower, the model performed much better on the validation dataset this time. It also looks as if the model could keep improving at a steady rate beyond our 5 epochs.
7. Repeat until satisfied¶
We could keep going: restructuring the model architecture, adding more layers, adjusting the learning rate, trying different augmentation methods and so on. As you can guess, this takes a long time.
Thankfully, there's a trick called transfer learning.
We'll save that for the next notebook. In the meantime, let's make a prediction with our trained multi-class model.
Making a prediction with our trained model¶
What good is a model, if you can't make predictions with it?
Let's remind ourselves of the 10 food categories we're dealing with before trying some custom images.
# what are our class names?
class_names
array(['chicken_curry', 'chicken_wings', 'fried_rice', 'grilled_salmon',
'hamburger', 'ice_cream', 'pizza', 'ramen', 'steak', 'sushi'],
dtype='<U14')
Let's get some custom images.
And now we'll use the pred_and_plot function to make a prediction with model_11 on an image and see its output.
# make prediction on model_11 with custom image
pred_and_plot(model=model_11,
filename='03-steak.jpeg',
class_names=class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 1s/step
Not correct, but let's try another image.
pred_and_plot(model=model_11,
filename='03-sushi.jpeg',
class_names=class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 86ms/step
still chicken curry :(
pred_and_plot(model_11, '03-pizza-dad.jpeg', class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 75ms/step
lmao chicken curry
Maybe there's something wrong with the pred_and_plot function. Let's make a prediction outside of that function instead.
# load in and preprocess our custom image
img = load_and_prep_image('03-steak.jpeg')
# make prediction
pred = model_11.predict(tf.expand_dims(img,axis=0))
# match the prediction class to the highest prediction probability
pred_class=class_names[pred.argmax()]
plt.imshow(img)
plt.title(pred_class)
plt.axis(False);
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 65ms/step
The prediction is still wrong, but at least it's different now.
The issue with our previous pred_and_plot function is most likely that it was written for binary classification and has no capability for handling multi-class outputs. The main problem lies in how it interprets the prediction.
# check the output of the predict function
pred = model_11.predict(tf.expand_dims(img,axis=0))
pred
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 75ms/step
array([[0.03498406, 0.112686 , 0.070467 , 0.21361639, 0.08432788,
0.15790951, 0.02166959, 0.07102852, 0.19622356, 0.03708745]],
dtype=float32)
Our model's final layer uses a softmax activation with 10 output neurons, where each neuron outputs a prediction probability for one class.
We can use argmax to find the index of the highest probability, then look that index up in class_names to see which food item the model deemed most likely.
# find the predicted class name
class_names[pred.argmax()]
np.str_('grilled_salmon')
Knowing that, we can adjust the pred_and_plot function to work with multi-class outputs as well as binary ones.
# adjust function to work with multi-class
def pred_and_plot(model, filename, class_names):
'''
Imports an image located at filename, makes a prediction on it with
a trained model and plots the image with the predicted class as the title.
'''
# import the target image and preprocess it
img = load_and_prep_image(filename)
# make a prediction
pred = model.predict(tf.expand_dims(img,axis=0))
# get the predicted class
if len(pred[0]) > 1: # check for multiclass
pred_class = class_names[pred.argmax()] # if more than one output, take the max value
else:
pred_class = class_names[int(tf.round(pred)[0][0])] # if only one output, round up value that's > 0.5
# plot the image and predicted class
plt.imshow(img)
plt.title(f'Prediction: {pred_class}')
plt.axis(False);
Let's try it out, now that it shouldn't keep predicting chicken curry.
pred_and_plot(model_11, '03-steak.jpeg', class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 111ms/step
pred_and_plot(model_11, "03-sushi.jpeg", class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 41ms/step
pred_and_plot(model_11, "03-pizza-dad.jpeg", class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 53ms/step
pred_and_plot(model_11, "03-hamburger.jpeg", class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step
Unfortunately the predictions still aren't very accurate; the model only reaches around 35-37% accuracy on the validation data.
We will improve on this later on through 'transfer learning'
Saving and loading our model¶
Once a model is trained, you'll probably want to save it and load it elsewhere.
To do so, we can use save and load_model functions.
# save a model
model_11.save('saved_trained_model.keras')
# load in a model and evaluate it
loaded_model_11 = tf.keras.models.load_model('saved_trained_model.keras')
loaded_model_11.evaluate(test_data)
79/79 ━━━━━━━━━━━━━━━━━━━━ 27s 326ms/step - accuracy: 0.3824 - loss: 1.8630
[1.848818302154541, 0.3700000047683716]
# compare saved model to unsaved model
model_11.evaluate(test_data)
79/79 ━━━━━━━━━━━━━━━━━━━━ 6s 76ms/step - accuracy: 0.3665 - loss: 1.8623
[1.8488181829452515, 0.3700000047683716]
Exercises¶
- Check out the CNN explainer website, note the key terms, and explain convolution, pooling, etc. in your own words.
What is CNN?¶
It's a type of neural network suited to image data. The input is a tensor (an n-dimensional array), which passes through layers of neurons. The convolutional layers learn filters that slide over the image to enhance or select its important features, and every neuron also has weights and a bias that get tweaked when the model's predictions are off.
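As a minimal sketch of the kind of CNN just described (hypothetical layer sizes; 10 output classes as in our food dataset): convolution layers learn the filters, pooling shrinks the feature maps, and a dense softmax head turns the features into class probabilities.

```python
import tensorflow as tf

# Minimal illustrative CNN: conv layers learn filters, pooling shrinks
# feature maps, softmax outputs one probability per class.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),                 # RGB image input
    tf.keras.layers.Conv2D(10, 3, activation="relu"),    # 10 learned 3x3 filters
    tf.keras.layers.MaxPool2D(),                         # halve height/width
    tf.keras.layers.Conv2D(10, 3, activation="relu"),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Flatten(),                           # feature maps -> vector
    tf.keras.layers.Dense(10, activation="softmax"),     # 10-class probabilities
])
model.compile(loss="categorical_crossentropy",
              optimizer="adam",
              metrics=["accuracy"])
print(model.output_shape)  # (None, 10)
```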
How does a convolutional layer work?¶
It starts with a kernel: a small grid of weights (say 3x3) that helps the layer discern important details of an image. Each weight multiplies the pixel value beneath it, and the products are summed into a single value (a dot product), which becomes one pixel of the output feature map. The kernel then slides along and repeats. How far it moves each step is called the stride: a stride of 1 moves one pixel at a time, a stride of 2 moves two pixels.
Because images have RGB channels (effectively three stacked images), the kernel is applied to each channel and the three results are summed into a single output value, with a learned bias added on top, based on what the model thinks will help prediction accuracy.
Each of these filters is there to detect a specific feature, such as edges, eyes or curves.
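The sliding-window arithmetic above can be shown with a toy example in NumPy: a 3x3 vertical-edge kernel slides over a 4x4 single-channel "image" with stride 1 and no padding, producing a 2x2 output (the image and kernel values here are made up for illustration).

```python
import numpy as np

# Toy 4x4 "image": bright on the left, dark on the right (a vertical edge)
image = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [1, 1, 0, 0]], dtype=float)

# 3x3 kernel that responds to vertical edges (+1 left column, -1 right column)
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)

# Slide the kernel with stride 1: multiply element-wise, then sum
out = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        patch = image[i:i+3, j:j+3]          # 3x3 window at this position
        out[i, j] = np.sum(patch * kernel)   # dot product -> one output pixel
print(out)  # [[3. 3.] [3. 3.]] -- strong response near the edge
```

With a stride of 2, the window would jump two pixels at a time and the output would be smaller.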
ReLU and Softmax¶
ReLU is widely used because it introduces non-linearity, which makes it great for complex models: not every relationship in the data is a straight line. It works by keeping all positive values the same and setting all negative values to 0.
Softmax is typically used on multi-class classification problems. When there are more than 2 output classes, the raw scores don't add up neatly to 1, so softmax normalizes them into probabilities across all classes that sum to 1, i.e. 100%.
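Both functions are simple enough to write out numerically (a toy illustration with made-up score values):

```python
import numpy as np

def relu(x):
    # Keep positives as-is, clamp negatives to 0
    return np.maximum(0, x)

def softmax(x):
    # Subtract the max for numerical stability, then normalize to sum to 1
    e = np.exp(x - np.max(x))
    return e / e.sum()

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # [0.  0.  0.  1.5 3. ]

scores = np.array([2.0, 1.0, 0.1])   # raw class scores ("logits")
probs = softmax(scores)
print(probs, probs.sum())            # probabilities summing to 1.0
```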
Max Pooling¶
Max pooling shrinks the feature map while keeping the most important information within it. Typically it looks at a 2x2 grid area and keeps only the largest value in that window, then slides along by its stride (usually 2) and repeats.
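A toy 2x2 max-pool with stride 2 in NumPy (made-up input values): each 2x2 block of the 4x4 input is reduced to its largest value, halving the height and width.

```python
import numpy as np

x = np.array([[1, 3, 2, 4],
              [5, 6, 1, 2],
              [7, 2, 9, 1],
              [3, 4, 5, 6]])

# Reshape into 2x2 blocks, then take the max within each block
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6 4]
               #  [7 9]]
```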